Oversubscription

Getting the number of threads right is difficult. The threads you would create with a threading package are logical threads, which map onto the physical threads of the hardware. For computations that do not wait on external devices, the highest efficiency usually occurs when there is exactly one running logical thread per physical thread. A mismatch in either direction costs efficiency.
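As a rough illustration of the one-running-logical-thread-per-physical-thread rule, the following sketch uses only the C++ standard library, not Threading Building Blocks. The helper names pick_thread_count and parallel_sum are hypothetical and exist only for this example. It assumes a compute-bound workload (summing integers) and that std::thread::hardware_concurrency() reports a usable count of physical threads, which the standard treats as a hint only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: aim for one logical thread per physical thread.
// hardware_concurrency() is only a hint and may return 0 when the
// value cannot be determined, so fall back to a single thread.
unsigned pick_thread_count() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}

// Hypothetical helper: sum `data` using `num_threads` logical threads,
// each working on its own contiguous chunk.
long long parallel_sum(const std::vector<int>& data, unsigned num_threads) {
    num_threads = std::max(1u, num_threads);
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Round up so that every element falls into some chunk.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        assert(begin <= end);
        workers.emplace_back([&data, &partial, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) w.join();  // wait for all chunks to finish
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling parallel_sum(data, pick_thread_count()) requests one logical thread per reported physical thread. Passing a much larger count would still give the right sum, but the extra threads would have to share the hardware by time slicing.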

If there are not enough running logical threads to keep the physical threads working, the result is undersubscription: some physical threads sit idle. If there are more running logical threads than physical threads, the result is oversubscription, which usually leads to time-sliced execution of logical threads and the overhead that comes with it.

The Threading Building Blocks task scheduler avoids undersubscription and oversubscription by selecting the number of logical threads that will likely make the most efficient use of the underlying hardware. It maps tasks to logical threads in a way that tolerates interference by other threads from the same or other processes.

Nested parallelism makes oversubscription more likely because a nested subroutine cannot easily check whether it is already running within a parallel operation at a higher level, so it may create a full set of threads of its own. Coordinating the creation of new threads across independent threads is also complex.
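The arithmetic behind this can be sketched with a simple model. The functions below are hypothetical and rest on a simplifying assumption: each of the outer threads independently creates the same number of inner threads and then blocks until they finish, so only the inner threads count as running.

```cpp
#include <cstddef>

// Hypothetical model: each of `outer` threads independently creates
// `inner` threads and waits for them, so outer * inner threads run.
constexpr std::size_t running_threads(std::size_t outer, std::size_t inner) {
    return outer * inner;
}

// Running logical threads per physical thread. A value above 1.0 means
// oversubscription; a value below 1.0 means undersubscription.
constexpr double oversubscription_factor(std::size_t outer,
                                         std::size_t inner,
                                         std::size_t physical) {
    return static_cast<double>(running_threads(outer, inner)) /
           static_cast<double>(physical);
}
```

Under this model, on a machine with 8 physical threads, an outer operation that starts 8 threads, each calling a subroutine that starts 8 more, yields 64 running logical threads and a factor of 8.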

It is important to try to take advantage of parallelism at all levels of nesting to avoid undersubscription. Unfortunately, with raw threads this is nontrivial. Threading Building ...
