7.6 LOOP UNROLLING
Loop unrolling transforms a loop into a sequence of statements. It is a parallelizing and optimizing compiler technique [29] used to eliminate loop overhead, such as updating and testing the loop index and checking the termination condition. The technique is also used to expose instruction-level parallelism [20]. Consider the loop shown in Listing 7.4 [20]:
Listing 7.4 Exposing potential parallelism by loop unrolling
for i = 1:I do
    y(i) = y(i) + y(i - 5)
end for
We note that the updated value of y(i) depends on its current value and on the element five positions earlier, y(i − 5). Because the dependence distance is 5, any five consecutive iterations are independent of one another, so the loop can be unrolled to execute five statements in parallel, as shown in Listing 7.5 [20].
Listing 7.5 Exposing potential parallelism by loop unrolling.
for i = 1:5:I do
    y(i) = y(i) + y(i - 5)
    y(i + 1) = y(i + 1) + y(i - 4)
    y(i + 2) = y(i + 2) + y(i - 3)
    y(i + 3) = y(i + 3) + y(i - 2)
    y(i + 4) = y(i + 4) + y(i - 1)
end for
Each iteration of the unrolled loop now contains five independent statements that can execute in parallel, giving an ideal speedup ratio of 5.
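The equivalence of the two loops can be checked with a minimal Python sketch. It is an illustration under stated assumptions, not the method of [20]: the function names `rolled` and `unrolled` are hypothetical, indexing is 0-based, and the first five elements of y are treated as initial values so that y[i - 5] is always defined. The sketch also adds a remainder loop for the case where the number of updated elements is not a multiple of 5, a case the listing above does not address. Python runs the five statements sequentially, so this shows only that the unrolled form computes the same result, not the speedup itself.

```python
def rolled(y):
    """Original loop: one update per iteration."""
    y = list(y)  # work on a copy so the caller's data is untouched
    for i in range(5, len(y)):
        y[i] = y[i] + y[i - 5]
    return y


def unrolled(y):
    """Loop unrolled by a factor of 5, with a remainder loop."""
    y = list(y)
    n = len(y)
    i = 5  # assumption: y[0..4] hold initial values
    # Main body: the five statements read y[i-5..i-1] and write y[i..i+4],
    # so none of them depends on another within the same iteration.
    while i + 5 <= n:
        y[i] = y[i] + y[i - 5]
        y[i + 1] = y[i + 1] + y[i - 4]
        y[i + 2] = y[i + 2] + y[i - 3]
        y[i + 3] = y[i + 3] + y[i - 2]
        y[i + 4] = y[i + 4] + y[i - 1]
        i += 5
    # Remainder: handle the leftover elements one at a time.
    while i < n:
        y[i] = y[i] + y[i - 5]
        i += 1
    return y


data = list(range(12))
print(rolled(data) == unrolled(data))
```

The main body performs one loop test per five updates instead of one per update, which is the loop-overhead reduction described at the start of this section.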