7.5 LOOP SPREADING FOR SIMPLE DEPENDENT LOOPS

Consider the dependent loop shown in Listing 7.3, where s(i, j) is some statement or task to be executed.

Listing 7.3 1-D IIR digital filter algorithm

1: for i = 1:I do2: for j = 1:J do3: s(i, j) = f(s(i, j − 1))4: end for5: end for

where function evaluated by statement s(i, j) depends on s(i, j − 1). One way to distribute the tasks among the processors is to implement each iteration of the outer loop among I processors so that processor i implements all the iterations of the inner loop with its dependencies. We could increase the workload for each processor by allocating more than one iteration of the outer loop for each processor using a similar technique to the one explained in Section 7.3.

Get Algorithms and Parallel Computing now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.