7.5 LOOP SPREADING FOR SIMPLE DEPENDENT LOOPS
Consider the dependent loop shown in Listing 7.3, where s(i, j) is some statement or task to be executed.
Listing 7.3 1-D IIR digital filter algorithm
1: for i = 1:I do2: for j = 1:J do3: s(i, j) = f(s(i, j − 1))4: end for5: end for
where function evaluated by statement s(i, j) depends on s(i, j − 1). One way to distribute the tasks among the processors is to implement each iteration of the outer loop among I processors so that processor i implements all the iterations of the inner loop with its dependencies. We could increase the workload for each processor by allocating more than one iteration of the outer loop for each processor using a similar technique to the one explained in Section 7.3.
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access