2012-11-22

Standard allocator. Part 2

(continued)

Having reviewed available NCollection allocators, we can create a C++ standard compliant allocator that would wrap them.

Please have a look at the NCollection_StdAllocator.hxx which is available in the git repository, branch CR23569. Here is a corresponding tracker report.

You can feed it with any instance of NCollection_BaseAllocator and then provide it to any STL container, for instance:

    Handle(NCollection_IncAllocator) anIncAlloc = new NCollection_IncAllocator;
    NCollection_StdAllocator aSAlloc (anIncAlloc);
    std::list > aL (aSAlloc);
    TopoDS_Solid aSolid = BRepPrimAPI_MakeBox (10., 20., 30.);
    aL.push_back (aSolid);

Default constructor uses NCollection_BaseAllocator::CommonBaseAllocator() and thus redirects all allocation calls to Open CASCADE central allocator interface. The following will do exactly this:
    std::list > aL2;

This makes NCollection_StdAllocator a superset of Standard_StdAllocator suggested in Part1 and thus self-sufficient.

Moreover, as allocators for one object type can be casted to another type (part of the C++ standard requirement), you might end up using a single type:

typedef NCollection_StdAllocator AllocatorType;

AllocatorType anAlloc (new NCollection_IncAllocator);
std::list aL3 (anAlloc);
std::vector aV (anAlloc);

and also:
typedef std::basic_string, AllocatorType> StringType;
StringType aS1; //defaults to central allocator interface
StringType aS2 (anAlloc); //uses pool allocator
StringType aS3 (AllocatorType (NCollection_HeapAllocator::GlobalHeapAllocator())); //directly uses OS allocator and is effectively equivalent to std::string


Now, with NCollection_StdAllocator you can take advantage of available Open CASCADE allocators and use STL, Boost or any other containers or template classes that accept standard allocator. By the way, this flexibility enabled me to start gradually moving from NCollection to Boost, but that's a subject for a separate post ;-)...

NCollection allocators

The previous post has introduced a C++ standard compliant allocator that wraps Open CASCADE allocator interface (Standard::Allocate(), ::Free(), ::Reallocate()) and thus can be used in any containers or other template classes that accept such allocator.

Before we move forward to consider another use case of standard allocator, let's make a step aside and review another allocator flavors offered by Open CASCADE.

Those of you who happened to use templates from NCollection or just attentively read the source code employing it might notice NCollection_BaseAllocator base class. NCollection containers sort of tried to mimic interfaces of STL containers and as such introduced allocator in its inception in early 2000es. However, unlike STL where allocator is part of the type (e.g. std::list is a different type than std::list) NCollection allows to dynamically supply an allocator, for instance:

NCollection_List aVec1, aVec2 (new NCollection_IncAllocator);

There are three types of allocator in NCollection:
-    NCollection_BaseAllocator, which is a base class that also implements a wrapper over Standard::Allocate() and ::Free(). There is a singletone returned by NCollection_BaseAllocator::CommonBaseAllocator().
-    NCollection_HeapAllocator, which wraps standard OS allocator (malloc/free). Similar to a base class there is a singletone returned by NCollection_HeapAllocator::GlobalHeapAllocator().
-    NCollection_IncAllocator, which implements a pool allocator.

The former two are straightforward and probably only NCollection_IncAllocator deserves some details.

Pool allocators (see for instance Wikipedia) are most helpful to manage temporary objects, especially when their destruction can be deferred to some common point in time, e.g. completion of an algorithm that created them. Implementation of a pool allocator is very straightforward – it preallocates a large chunk of memory (e.g. from a system heap or another allocator), and whenever there is a new temporary object to be allocated, just takes a required size from that large chunk's head pointer and shifts that pointer. Deallocation is void, i.e. no memory gets really freed after the object destructor has been called. Real deallocation of large chunk(s) happens upon own pool allocator destruction.

This simple implementation allows to greatly improve performance when dealing with numerous temporary objects due to avoiding fragmentation and/or complicated techniques to manage individual smaller memory chunks.

Here is a usage example which creates a temporary hash table:
class PartnerShapeHasher
{
public:
    static Standard_Integer HashCode (const TopoDS_Shape& theKey, const Standard_Integer theUpper)
    {
        return ::HashCode (theKey.TShape(), theUpper);
    }
    static Standard_Boolean IsEqual(const TopoDS_Shape& theKey1, const TopoDS_Shape& theKey2)
    {
        return theKey1.IsPartner (theKey2);
    }
};

//! Returns a number of partner subshapes of a given type
/*! A partner shape is a shared instance regardless of a location (transformation matrix).*/
static size_t NumberOfPartnerSubshapes (const TopoDS_Shape& theShape, const TopAbs_ShapeEnum theType)
{
    NCollection_Map aMap (1 /*default number of buckets*/, new NCollection_IncAllocator);
    for (TopExp_Explorer anExp (theShape, theType); anExp.More(); anExp.Next()) {
        const TopoDS_Shape& aSubshape = anExp.Current();
        aMap.Add (aSubshape);
    }
    return aMap.Size();
}

If your temporary objects' life span is not necessarily aligned with the end of the algorithm scope (e.g. some objects are destroyed earlier) then you could either have a separate pool allocator for such objects or just accept a trade-off of temporary larger peak memory footprint.

NCollection_IncAllocator's constructor accepts a parameter theBlockSize which specifies a size of large memory chunk(s) preallocated from the heap. I ended up with having a few predefined sizes via the following enumeration:

enum NCollection_IncAllocator_Size {
    NCollection_IncAllocator_Tiny = 1000,
    NCollection_IncAllocator_Small = 4000,
    NCollection_IncAllocator_Medium = 8000,
    NCollection_IncAllocator_Large = 16000,
    NCollection_IncAllocator_Huge = 32000
};

and use respective value depending on the most probable case, e.g.:
Handle(NCollection_BaseAllocator) anAlloc = new NCollection_IncAllocator (NCollection_IncAllocator_Tiny);
NCollection_List aList (anAlloc);

This should help avoid extra fragmentatoin in the OS allocator due to reuse of the same size blocks over and over again, yet providing a hint instead of default constructor parameter (around 26K).

If the allocator is requested to allocate an object of a larger size than the constructor's theBlockSize parameter then a new larger chunk is allocated. Whenever a new large chunk is allocated then the remaining unused free bytes in a previous chunk are wasted. So some attention should be made to minimize unnecessary waste, if it can be substantial.

Another point to remember is that NCollection_IncAllocator is not thread-safe (though reentrant), so access to the same instance should be synchronized outside, if it is used from multiple threads. For thread-safe scalable implementations you might want to check Intel Threading Building Blocks memory pools.

Now with the review of NCollection allocator completed, let's get back to the standard interface.