Lazy copying
It can be useful to copy a data structure only when another thread steals a task. Example 9-12 shows the start_reduce::execute() method, which implements parallel_reduce. The code forks the loop body object you provide only when the thread is running a stolen task. The fork gives the thief its own body, so it can run locally until it is done and then join its result to the original thread’s result. Because forks and joins incur some overhead, they are worth doing only when stealing actually occurs.
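The fork-only-on-steal idea can be sketched in plain C++ without a task scheduler. This is a sequential illustration, not the library's implementation: SumBody, split_tag, lazy_reduce, and the stolen predicate are hypothetical names invented for this sketch, and the predicate merely stands in for a real steal check such as is_stolen_task().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct split_tag {};  // hypothetical stand-in for a split marker type

// Hypothetical reduction body with a splitting constructor and a join method.
struct SumBody {
    long sum = 0;
    static int copies;  // counts how many times a body was forked

    SumBody() = default;
    // Splitting constructor: the fork starts with an empty partial sum.
    SumBody(SumBody&, split_tag) : sum(0) { ++copies; }

    void operator()(const std::vector<int>& v, std::size_t lo, std::size_t hi) {
        for (std::size_t i = lo; i < hi; ++i) sum += v[i];
    }
    void join(const SumBody& rhs) { sum += rhs.sum; }
};
int SumBody::copies = 0;

// Reduce v[lo, hi). stolen(lo, hi) is an assumed predicate that reports
// whether the subrange would have been taken by another thread.
template <typename StolenPred>
void lazy_reduce(SumBody& body, const std::vector<int>& v,
                 std::size_t lo, std::size_t hi,
                 std::size_t grain, StolenPred stolen) {
    assert(lo <= hi);
    if (hi - lo <= grain) {          // small enough: apply the body directly
        body(v, lo, hi);
        return;
    }
    std::size_t mid = lo + (hi - lo) / 2;
    lazy_reduce(body, v, lo, mid, grain, stolen);  // left half reuses body
    if (stolen(mid, hi)) {
        SumBody right(body, split_tag{});          // fork only on a steal
        lazy_reduce(right, v, mid, hi, grain, stolen);
        body.join(right);                          // join the thief's result
    } else {
        lazy_reduce(body, v, mid, hi, grain, stolen);  // no copy, no join
    }
}
```

When the predicate never reports a steal, SumBody::copies stays at zero and the whole range is accumulated into the one body; each reported steal costs exactly one fork and one join.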
Example 9-12. parallel_reduce start_reduce::execute( ) method
template<typename Range, typename Body>
task* start_reduce<Range,Body>::execute() {
    Body* body = my_body;
    if( is_stolen_task() ) {
        // Stolen: fork a new Body, constructed in space reserved in the parent
        finish_reduce<Body>* p = static_cast<finish_type*>(parent() );
        body = new(p->zombie_space.begin()) Body(*body,split());
        my_body = p->right_zombie = body;
    }
    task* next_task = NULL;
    if( !my_range.is_divisible() )
        // Range cannot be split further: apply the body to it
        (*my_body)( my_range );
    else {
        // Create a continuation that will join the two halves
        finish_reduce<Body>& c =
            *new(allocate_continuation()) finish_type(body);
        recycle_as_child_of(c);
        c.set_ref_count(2);
        // Split off part of my_range into a new child task and spawn it
        start_reduce& b =
            *new(c.allocate_child()) start_reduce(Range(my_range,split()), body);
        c.spawn(b);
        // Run this recycled task next on the remaining range
        next_task = this;
    }
    return next_task;
}
The method task::is_stolen_task provides a way to detect stealing. It is called on a running task, typically by the task itself. Informally speaking, it returns true if the task is stolen. Formally, it returns true if the thread that owns the task is not the thread that owns the task’s dependent. For the usual fork-join ...