ParallelMerge
The example in this section (Example 11-8) is more complex and requires a little familiarity with the Standard Template Library (STL) to fully understand. It shows the power of parallel_for beyond flat iteration spaces. The code performs a parallel merge of two sorted sequences. It works for any sequence with a random-access iterator. The algorithm operates recursively as follows:
If the sequences are too short for effective use of parallelism, it does a sequential merge. Otherwise, it performs steps 2–6.
It swaps the sequences if necessary so that the first sequence,
[begin1, end1), is at least as long as the second sequence,[begin2, end2).It sets
m1to the middle position in[begin1, end1). It calls the item at that locationkey.It sets
m2to wherekeywould fall in[begin2, end2).It merges
[begin1,m1)and[begin2,m2)to create the first part of the merged sequence.It merges
[m1,end1)and[m2,end2)to create the second part of the merged sequence.
The Intel Threading Building Blocks implementation of this algorithm uses the Range object to perform most of the steps. The predicate is_divisible performs the test in step 1, along with step 2. The splitting constructor performs steps 3–6. The body object does the sequential merges.
Example 11-8. Parallel merge
#include "tbb/parallel_for.h" #include <algorithm> using namespace tbb; template<typename Iterator> struct ParallelMergeRange { static size_t grainsize; Iterator begin1, end1; // [begin1,end1) is 1st sequence to be ...Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access