Task Scheduler Summary
The task scheduler works most efficiently for fork-join parallelism with lots of forks. With many forks, task stealing causes enough breadth-first behavior to keep all threads occupied, and each thread then works in a depth-first manner until it needs to steal more work.
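The interplay of depth-first execution and breadth-first stealing can be sketched with a toy, single-threaded model of one worker's task pool. This is not TBB's implementation: the class `ToyTaskPool` and its members are hypothetical names, and the sketch assumes the common work-stealing design in which the owner takes its newest task while a thief takes the oldest one.

```cpp
#include <cassert>
#include <deque>

// Toy model of one worker's task pool (hypothetical, not the TBB API).
// Integer ids stand in for real tasks; no synchronization is modeled.
class ToyTaskPool {
public:
    // The owner adds each newly spawned task at the back.
    void spawn(int task_id) { tasks_.push_back(task_id); }

    // The owner takes its most recently spawned task (depth-first order).
    // Returns false if the pool is empty.
    bool take_own(int& task_id) {
        if (tasks_.empty()) return false;
        task_id = tasks_.back();
        tasks_.pop_back();
        return true;
    }

    // A thief takes the oldest task (breadth-first order).
    // Returns false if there is nothing to steal.
    bool steal(int& task_id) {
        if (tasks_.empty()) return false;
        task_id = tasks_.front();
        tasks_.pop_front();
        return true;
    }

private:
    std::deque<int> tasks_;
};
```

If tasks 1, 2, and 3 are spawned in that order, a thief gets task 1, typically the root of the largest remaining subtree, while the owner continues with task 3. More forks mean more tasks near the front of the pool for idle threads to steal.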
The task scheduler is not the simplest possible scheduler because it is designed for speed. If you need to use it directly, it may be best to hide it behind a higher-level interface, as the templates parallel_for, parallel_reduce, and so on do. Some of the details to remember are:
- Always use new(allocation_method) T to allocate a task, where allocation_method is one of the allocation methods of the class task. Do not create local or file-scope instances of a task.
- Allocate all siblings before any of them start to run, unless you are using allocate_additional_child_of.
- Exploit continuation passing, scheduler bypass, and task recycling to squeeze out maximum performance.
- If a task completes and was not marked for reexecution, it is automatically destroyed. Also, its dependent’s reference count is decremented, and if it hits 0, the dependent is automatically spawned.
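The completion rule in the last item can be illustrated with a small stand-alone model. It is only a sketch under stated assumptions: `ToyTask` and `complete` are hypothetical names, not part of the TBB API, and the model tracks only the reference count and the list of tasks that become ready to spawn.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of a task (hypothetical type, not tbb::task).
struct ToyTask {
    std::string name;
    ToyTask* dependent;  // task that waits on this one, or nullptr
    int ref_count;       // number of tasks this one still waits for
};

// Models what happens when `t` completes without being marked for
// reexecution: the dependent's reference count is decremented, and
// if it reaches 0 the dependent is appended to `ready`, which stands
// in for "spawned". The real scheduler would also destroy `t`; the
// toy leaves object lifetime to the caller.
inline void complete(ToyTask& t, std::vector<ToyTask*>& ready) {
    ToyTask* d = t.dependent;
    if (d == nullptr) return;
    assert(d->ref_count > 0);
    if (--d->ref_count == 0) ready.push_back(d);
}
```

In this model, a continuation with a reference count of 2 and two children becomes ready only after the second child completes. The same model suggests why siblings should be allocated before any of them run: if the count were still being set up while a child completed, it could reach 0 too early.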