Suppose you want to apply a function Foo to each element of an array, and it is safe to process each element concurrently. Example 3-3 shows the sequential code to do this.

Example 3-3. Original loop code

void SerialApplyFoo( float a[], size_t n ) {
    for( size_t i=0; i<n; ++i )
        Foo(a[i]);
}

The iteration space here is of type size_t, and it goes from 0 to n-1. The template function tbb::parallel_for breaks this iteration space into chunks and runs each chunk on a separate thread.
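To make the idea of chunking concrete, here is a minimal hand-rolled sketch that splits the range 0 to n-1 into contiguous chunks and gives each chunk to its own std::thread. It is not how tbb::parallel_for is implemented; the function ChunkedApplyFoo, its num_chunks parameter, and the stand-in definition of Foo are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for Foo: doubles the element in place.
void Foo( float& x ) { x *= 2.0f; }

// Illustration only: split [0, n) into at most num_chunks contiguous chunks
// and process each chunk on its own std::thread. The library's scheduler
// does something more sophisticated than this.
void ChunkedApplyFoo( float a[], std::size_t n, std::size_t num_chunks ) {
    assert( num_chunks > 0 );
    if( n == 0 ) return;
    // Ceiling division, so the last chunk absorbs any remainder.
    std::size_t chunk = ( n + num_chunks - 1 ) / num_chunks;
    std::vector<std::thread> workers;
    for( std::size_t begin = 0; begin < n; begin += chunk ) {
        std::size_t end = std::min( begin + chunk, n );
        // Each thread touches only its own half-open range [begin, end).
        workers.emplace_back( [a, begin, end] {
            for( std::size_t i = begin; i != end; ++i )
                Foo( a[i] );
        } );
    }
    // Wait for every chunk to finish before returning.
    for( std::thread& t : workers )
        t.join();
}
```

Because the chunks do not overlap, no two threads write the same element, which is the safety condition assumed at the start of this section.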

The first step in parallelizing this loop is to convert the loop body into a form that operates on a chunk. The form is a Standard Template Library (STL)-style function object, called the body object, in which operator() processes a chunk. Example 3-4 declares the body object.

Example 3-4. A class for use by a parallel_for

#include "tbb/blocked_range.h"

class ApplyFoo {
    float *const my_a;
public:
    // Apply Foo to every element in the chunk described by r.
    void operator()( const tbb::blocked_range<size_t>& r ) const {
        float *a = my_a;
        for( size_t i=r.begin(); i!=r.end(); ++i )
            Foo(a[i]);
    }
    ApplyFoo( float a[] ) :
        my_a(a)
    {}
};

Note the iteration space argument to operator(). A blocked_range<T> is a template class provided by the library. It describes a one-dimensional iteration space over type T. The parallel_for template works with other kinds of iteration spaces, too. The library provides blocked_range2d for two-dimensional spaces. A little later in this chapter, in the section “Advanced Topic: Other Kinds of Iteration Spaces,” I will explain how you can define your own ...
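As a rough illustration of what a one-dimensional iteration space object might look like, the sketch below defines a simplified stand-in range and a serial driver that splits it recursively before calling a body object on each piece. SimpleRange, its split() method, SerialSplitFor, and ScaleBody are hypothetical names invented for this sketch. They are not the library's blocked_range or parallel_for, and the driver runs the chunks one after another rather than in parallel.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified range over T describing the half-open
// interval [begin, end). Not the library's blocked_range.
template<typename T>
class SimpleRange {
    T my_begin, my_end;
    std::size_t my_grainsize;
public:
    SimpleRange( T b, T e, std::size_t grainsize = 1 )
        : my_begin(b), my_end(e), my_grainsize(grainsize) {
        assert( grainsize > 0 );
    }
    T begin() const { return my_begin; }
    T end() const { return my_end; }
    std::size_t size() const { return std::size_t( my_end - my_begin ); }
    // A range is worth splitting only while it is larger than the grain size.
    bool is_divisible() const { return size() > my_grainsize; }
    // Keep the lower half in *this and return the upper half.
    SimpleRange split() {
        T mid = my_begin + ( my_end - my_begin ) / 2;
        SimpleRange upper( mid, my_end, my_grainsize );
        my_end = mid;
        return upper;
    }
};

// Serial driver: split until chunks are small enough, then run the body.
template<typename Range, typename Body>
void SerialSplitFor( Range r, const Body& body ) {
    if( r.is_divisible() ) {
        Range upper = r.split();
        SerialSplitFor( r, body );
        SerialSplitFor( upper, body );
    } else {
        body( r );
    }
}

// Body object in the style of Example 3-4, written against SimpleRange.
// It doubles each element instead of calling Foo.
class ScaleBody {
    float *const my_a;
public:
    explicit ScaleBody( float a[] ) : my_a(a) {}
    void operator()( const SimpleRange<std::size_t>& r ) const {
        for( std::size_t i = r.begin(); i != r.end(); ++i )
            my_a[i] *= 2.0f;
    }
};
```

A call such as SerialSplitFor( SimpleRange<std::size_t>(0, n, 2), ScaleBody(a) ) visits every index in 0 to n-1 exactly once, in chunks of at most two elements. The body object never needs to know how the range was divided.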
