Extreme fight with Extrema. Part 2
(continued)
*sqrt()*
The first one of such was sqrt(), which was among top 5 hotspots. As it is a system function, I did not notice it until pressed a button on the Amplifier toolbar (the matter of fact is that by default Amplifier attributes system functions time to its callers).
After unwinding its stacks I found the reason. Extrema used gp_Pnt::Distance() (or sometimes gp_Vec::Magnitude()) everywhere both to compute exact distances and to find out if one point is closer than another. Well, this had a price for such an abuse – 6.5% of total CPU time. Without sacrificing any correctness, distances can be replaced with square distances. This is a hint for you – try to use SquareDistance() or SquareModulus() wherever possible instead of their countparts invoking sqrt(), if you have multiple calculations. So after made modifications, I managed to save about 5-6% (I did not eliminate all the calls).
*IsCacheValid()*
Another finer improvement related to Geom_BSplineCurve. As we already noticed and will see more precisely below, calculations of points along B-Spline curves were intensively performed. This involves Geom_BSplineCurve::IsCacheValid() (to check if a local cache can be used for faster computations). Its time was 6.25%, and looking at the source and assembly code helped to understand why:
Reason – division (the fdiv instruction) by spinlenghtcache to normalize an original range into [0, 1). Actually it could be easily avoided and I simplified the code also eliminating multiple branching to increase readability (and possibly performance, as branching is expensive). Compare:
This gave about 60% time reduction of this function. Well, floating point operations are expensive, and even if subtraction is not as expensive as division, it still takes time.
*Calculations of B-Spline curve points*
Unwinding the stacks of the top hotspot PLib::NoDerivativeEvalPolynomial() helped locate the place where it was called from – Geom_BSplineCurve::D0() – which in its turn was called from Extrema (remember above mentioned tons of calls from inside ?). I put a counter and measured that number of calls was 272,3 millions ! Wow !
As nothing in the method itself promised any improvements, I had to made modifications upstream. In Extrema.
*Extrema*
Until now Extrema architecture regarding curve-curve case was very inflexible. It always redid all computations flushing out results of any previous call. On the other hand, point-surface and curve-surface has some caching, so I wonder why developers missed it for this case.
Anyway, I extended the API to be able to first set every curve and/or its ranges and later perform calculations. To take advantage of that, I also had to modify IntTools_BeanBeanIntersector::ComputeUsingExtrema() to set curve for theRange2 (specified as its argument) only once. This was a first step and reduced number of calls to Geom_BSplineCurve::D0() / PLib:: PLib::NoDerivativeEvalPolynomial() to 8.6M.
(To be continued)
0 comments