Memory Allocators
The solution to the challenges of concurrent memory allocation is to use a scalable memory allocator, either from Intel Threading Building Blocks or from another third-party solution. The Threading Building Blocks scalable memory allocator maintains per-thread memory pools, minimizing the contention that comes from allocating out of a single global heap.
Threading Building Blocks offers two choices, both modeled on the STL template class std::allocator:
- scalable_allocator This template offers scalability only; it does not completely protect against false sharing. Memory is returned to each thread from a separate pool, which helps protect against false sharing as long as the memory is not shared with other threads.
- cache_aligned_allocator This template offers both scalability and protection against false sharing. It addresses false sharing by ensuring each allocation begins on its own cache line.
Note that protection against false sharing between two objects is guaranteed only if both are allocated with cache_aligned_allocator. For instance, if one object is allocated by cache_aligned_allocator<T> and another object is allocated some other way, there is no guarantee against false sharing.
The functionality of cache_aligned_allocator comes at some cost in space, because it allocates in multiples of cache-line-sized memory chunks, even for a small object. The padding is typically 128 bytes. Hence, allocating many small objects with cache_aligned_allocator may significantly increase memory usage.