Developing parallel applications with Open CASCADE. Part 1

by Roman Lygin - 19:50

As mentioned in an earlier post, I am now working on an ACIS importer for CAD Exchanger and am developing it to be parallel. The results are very promising so far (except for STL stream parsing, due to the Microsoft bug mentioned in that post, though I'm now installing VS2008 SP1 to check whether it has been fixed). So I decided to share my experience in this blog and hope it will be useful for other developers (questions about parallelism now appear more frequently on the forum). I also hope that the Open CASCADE team will benefit from these findings.

As I have already noted several times, let me underline that in the multi-core era parallel applications will become mainstream in the mid-term, and you had better prepare yourself for that trend now. This can be good for your career path, and these skills can be among your competitive advantages. The recently released Intel Parallel Studio makes it easier to develop and debug multi-threaded applications (it has become a vital part of my toolbox). There are good basic books on this subject. I am now reading "Patterns for Parallel Programming" by Timothy Mattson et al. (recommended by one of my skilled colleagues), an analogue of the famous "Design Patterns" by Erich Gamma et al. (a must-read book for any professional software developer). It helped me shape the architecture of the ACIS importer for CAD Exchanger.

Getting back to Open CASCADE, I can definitely say that it is quite usable for parallel applications but, like any other software library, it requires precautions.

General comments.
You will get the greatest gain if you look at your problem from a higher level, that of the problem domain, rather than from the perspective of a particular algorithm. Identify what can be done concurrently and what requires sequential execution. For instance, in my initial experiments with IGES translation (see here) I intentionally focused on parallelizing the translation of components of IGES groups, and this worked reasonably well. But the overall gain for the entire translation was very modest because this was not the part that took most of the time: Shape Healing consumed about 50%-70% of the total time, and it remained sequential. So, until you restructure your algorithms, your improvements will be limited. Working on the ACIS reader, I had to completely redesign the traditional architecture, a waterfall approach (where a model is traversed from a root entity down to the leaves) followed by shape healing. This gave a tremendous outcome. The Patterns book gives concrete recommendations that help identify 'patterns' in your algorithms.
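
The effect of a large sequential part can be estimated with Amdahl's law. The sketch below is only a back-of-the-envelope helper, assuming the non-sequential part scales perfectly with the number of threads (which it rarely does in practice); the function name is made up for this illustration.

```cpp
#include <cassert>

// Amdahl's law: upper bound on the overall speedup when only a part
// of the work runs in parallel.
// serialFraction - share of the total run time that stays sequential (0..1)
// threads        - number of threads applied to the remaining part
double AmdahlSpeedup(double serialFraction, int threads)
{
    assert(serialFraction >= 0.0 && serialFraction <= 1.0);
    assert(threads >= 1);
    return 1.0 / (serialFraction + (1.0 - serialFraction) / threads);
}
```

With four threads, AmdahlSpeedup(0.5, 4) gives 1.6 and AmdahlSpeedup(0.7, 4) gives about 1.29, so a stage that takes 50%-70% of the time and stays sequential caps the total gain well below 2x no matter how well the rest is parallelized.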

Handles
With 6.2.x, Open CASCADE handles (the hierarchy of Handle_Standard_Transient subclasses) come with a thread-safe implementation of reference counting. To take advantage of that, you must either define the MMGT_REENTRANT environment variable to a non-null value or call Standard::SetReentrant (Standard_True) before you start using handles across threads. This makes the reference counter rely on an atomic increment/decrement (e.g. InterlockedIncrement() on Windows) instead of ++, which may not be atomic.

void Handle(Standard_Transient)::BeginScope()
{
  if (entity != UndefinedHandleAddress)
  {
    if ( Standard::IsReentrant() )
      Standard_Atomic_Increment (&entity->count);
    else
      entity->count++;
  }
}
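
To see why the atomic variant matters, here is a minimal standalone sketch in standard C++11. It does not use Open CASCADE: RefCounted and CountWithAtomic are hypothetical names, and std::atomic stands in for Standard_Atomic_Increment. Several threads bump one shared counter, as they would when copying handles to the same object.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a reference-counted object
// (not OCC's Standard_Transient).
struct RefCounted
{
    std::atomic<int> count{0};
};

// Each thread "copies a handle" perThread times, i.e. increments the
// shared counter. Returns the final counter value.
int CountWithAtomic(int numThreads, int perThread)
{
    assert(numThreads > 0 && perThread >= 0);
    RefCounted obj;
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i)
    {
        threads.emplace_back([&obj, perThread]() {
            for (int j = 0; j < perThread; ++j)
                obj.count.fetch_add(1, std::memory_order_relaxed); // atomic ++
        });
    }
    for (std::thread& t : threads)
        t.join();
    return obj.count.load();
}
```

With the atomic increment no update is lost, so the result always equals numThreads * perThread. With a plain int and ++ the same loop would be a data race, and increments could be lost.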

Memory management
By default, Open CASCADE ships with the MMGT_OPT variable set to 1. This forwards all Standard::Allocate() and ::Free() calls to the Open CASCADE memory manager (Standard_MMgrOpt), which optimizes memory allocation and mitigates memory fragmentation. (It probably deserves a separate post to describe the insides of the memory manager.)

Standard_MMgrOpt is itself thread-safe and safely handles simultaneous memory allocation/deallocation requests from several threads. However, it is based on Standard_Mutex (more on it in the future), which in its current implementation introduces extreme overhead and makes this optimized memory management worthless in parallel applications (while in a single-threaded environment it works just fine).

So, to overcome this deficiency you should use MMGT_OPT=0. It activates Standard_MMgrRaw, which simply forwards calls to malloc/free…
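
The pattern behind the problem can be sketched in a few lines. This is an illustration only, not the Standard_MMgrOpt code: LockedAllocator and RunWorkers are hypothetical names, and std::mutex stands in for Standard_Mutex. Every allocation and deallocation takes the same lock, so the allocator is correct under concurrency, but all threads queue at one point.

```cpp
#include <cassert>
#include <cstdlib>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical allocator guarded by one shared lock
// (illustration only, not Standard_MMgrOpt).
class LockedAllocator
{
public:
    void* Allocate(std::size_t size)
    {
        std::lock_guard<std::mutex> lock(myMutex); // every thread queues here
        ++myLiveBlocks;
        return std::malloc(size);
    }

    void Free(void* ptr)
    {
        std::lock_guard<std::mutex> lock(myMutex); // ...and here
        --myLiveBlocks;
        std::free(ptr);
    }

    long LiveBlocks()
    {
        std::lock_guard<std::mutex> lock(myMutex);
        return myLiveBlocks;
    }

private:
    std::mutex myMutex;
    long myLiveBlocks = 0;
};

// Runs numThreads workers, each doing `rounds` allocate/free pairs,
// and returns the number of blocks still live afterwards.
long RunWorkers(LockedAllocator& alloc, int numThreads, int rounds)
{
    assert(numThreads > 0 && rounds >= 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i)
    {
        threads.emplace_back([&alloc, rounds]() {
            for (int j = 0; j < rounds; ++j)
                alloc.Free(alloc.Allocate(64));
        });
    }
    for (std::thread& t : threads)
        t.join();
    return alloc.LiveBlocks();
}
```

The bookkeeping stays consistent (RunWorkers returns 0), but the more expensive the lock, the more the threads wait on each other instead of working. That waiting is what forwarding directly to malloc/free avoids.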

(to be continued...)

10 comments

Anonymous, August 19, 2009 at 3:39 PM
Hello Roman,

In a German MSDN Tech Talk by Microsoft and Intel about parallel programming, they said that before thinking about multi-threaded applications (which might be difficult) you can optimize the performance of your application just by compiling it with the Intel compiler (Intel Parallel Composer) instead of the standard Visual Studio compiler, because the Intel compiler is able to do many optimizations, e.g. vectorization. Have you compiled OCC with the Intel compiler, and does it improve the performance?

The link to the MSDN Tech Talk:
http://www.microsoft.com/germany/msdn/techtalk/videos/library.aspx?id=msdn_de_33301

Regards,
Timo
Roman Lygin, August 19, 2009 at 7:56 PM
Hi Timo,
Glad to hear from you again. Well, I have often thought of giving the Intel compiler a try with Open CASCADE but have not gotten to it yet. It would involve converting the VS projects to use the Intel compiler and would be a massive amount of work requiring intensive regression testing. So it's a huge effort, and it would be in OCC's own best interest to give it a try.
Thanks for the link, although I don't understand German ;-). But there are many German readers here (more than from any other country), so they will enjoy it.

The Intel compiler is indeed known to produce generally faster code (esp. on Intel architecture) but I have not tried it myself yet :-(. It is on my to-do list though...
Anonymous, August 20, 2009 at 2:23 PM
In the talk it seemed quite simple to exchange the compilers because the Intel compiler integrates nicely into Visual Studio (but maybe only in VS2010?).
Roman Lygin, August 20, 2009 at 7:04 PM
Oh no! The Intel compiler integrates well into VS2005, 2008 and likely 2003.
spiceboy, June 18, 2014 at 1:17 PM
Hi Roman,

I have read two IGES files, and they are stored in Shape_1 and Shape_2, which are of type TopoDS_Shape. Now I need to translate Shape_1 from point A in Shape_1 to point B in Shape_2. I found that gp_Trsf can be used for translation, but I don't understand which function performs this and how to apply the translation to the TopoDS_Shape Shape_1.
Roman Lygin, June 18, 2014 at 1:27 PM
Well, not the best post to attach this comment to ;-).
gp_Trsf aT;
aT.SetTranslation (p1, p2);
TopoDS_Shape aShape1 = ...
aShape1.Move (aT);
spiceboy, June 19, 2014 at 12:23 PM
Hi Roman,

Sorry, I couldn't find where else to post. Last but not least, can you similarly tell me how to perform rotation and scaling?
Roman Lygin, June 21, 2014 at 1:51 AM
Hi spiceboy,
Everything works just fine, as expected. I used BRepPrimAPI_MakeBox to create myShape and saved the results to .brep files:
TopoDS_Shape myShape = BRepPrimAPI_MakeBox (10., 20., 30.);
gp_Trsf aT1,aT2;
gp_Pnt p1;
gp_Pnt p2;
gp_Pnt p3;
p1.SetCoord(-20.620314,0.10587443,-146.79472);
p2.SetCoord(-435.57297,259.86014,1649.6030);
p3.SetCoord(-373.50610,246.80386,1852.6418);
aT1.SetTranslation(p1,p2);
myShape.Move(aT1);
BRepTools::Write (myShape, "C:/temp/myshape2.brep");
aT2.SetScale(p3,2.5);
BRepBuilderAPI_Transform Brep_Trsf(myShape,aT2,Standard_True);
TopoDS_Shape T_Shape = Brep_Trsf.Shape();
BRepTools::Write (T_Shape, "C:/temp/t_shape.brep");

So you might want to check your verification code.
Good luck.
Roman