As mentioned in an earlier post, I am now working on an ACIS importer for CAD Exchanger and am developing it to be parallel. The results are very promising so far (except STL streams parsing due to the Microsoft bug mentioned in that post, though I'm now installing VS2008SP1 to check if it has been fixed). So I decided to share my experience in this blog and hope it will be useful for other developers (questions on the forum about parallelism appear more frequently now). Also hope that Open CASCADE team will benefit from these findings.
As a quick note already made several time, let me underline that in multi-core era parallel applications will become mainstream in some mid-term and you better prepare yourself for that trend now. This can be good for your career path and these skills can be among your competitive advantages. Recently released Intel Parallel Studio eases work of developers to develop and to debug multi-threaded applications (it has become a vital part of my toolbox). There are good basic books on this subject. I am now reading "Patterns for Parallel Programming" by Timothy Mattson et al (recommended by one my skilled colleague), an analogue of famous "Design Patterns" by Erich Gamma (the must read book for any professional software developer). It helped me shape the architecture of the ACIS importer for CAD Exchanger.
Getting back to Open CASCADE, I can definitively say that it is quite usable for parallel applications but like any other software library requires precautions.
You will get major gain if you look at your problem from a higher level, from the problem domain, and not from a particular algorithm perspective. Identify what can be done concurrently and what requires sequential execution. For instance, in my initial experiments with IGES translation (see here) I intentionally focused on parallelizing translation of components of IGES groups and this worked reasonably well. But overall gain for entire translation was very modest because it was not this part that took most part – it was Shape Healing that ate about 50%-70% of a total time, and it remained sequential. So, until you restructure your algorithms your improvements will be limited. Working on the ACIS reader I had to completely redesign traditional architecture of waterfall approach (when a model is traversed from a root entity down to a leaf) and consequent shape healing. This gave tremendous outcome. The Patterns book gives concrete recommendations that help identify ‘patterns' in your algorithms.
With 6.2.x Open CASCADE handles (hierarchy of Handle_Standard_Transient subclasses) come with thread-safe implementation for reference counting. To take advantage of that you must either define MMGT_REENTRANT system variable to non-null or call Standard::SetReentrant (Standard_True) before you start using handles across threads. This makes reference counter to rely on atomic increment/decrement (e.g. InterlockedIncrement() on Windows) instead of ++ which can be not atomic.
if (entity != UndefinedHandleAddress)
if ( Standard::IsReentrant() )
By default, Open CASCADE ships with MMGT_OPT variable defined to 1. This forwards all Standard::Allocate() and ::Free() calls to the Open CASCADE memory manager (Standard_MMgrOpt) which optimizes memory allocation mitigating memory fragmentations. (Probably it deserves a separate post to describe insights of the memory manager.)
Standard_MMgrOpt is thread-safe itself and safely regulates simultaneous memory allocation/deallocation requests from several threads. However it is based on Standard_Mutex (more on it in the future) which in its current implementation introduces extreme overhead and makes this optimized memory management worthless in parallel applications (while in a single threaded environment it works just fine).
So, to overcome this deficiency you should use MMGT_OPT=0. It activates Standard_MMgrRaw that simply forward calls to malloc/free…
(to be continued...)