O'Reilly logo

Advanced Backend Optimization by Sid Touati, Benoit de Dinechin

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Appendix 5

Loop Unroll Degree Minimization: Experimental Results

All our benchmarks have been cross-compiled on a regular Dell workstation, equipped with Intel(R) Core(TM)2 CPU of 2.4 GHz and Linux operating system (kernel version 2.6, 64 bits).

A5.1. Stand-alone experiments with single register types

This section presents full experiments on a stand-alone tool by considering a single register type only. Our stand-alone tool is independent of the compiler and processor architecture. We will demonstrate the efficiency of our loop minimization method for both unscheduled loops (as studied in section 11.4) and scheduled loops (as studied in section 11.6).

A5.1.1. Experiments with unscheduled loops

In this context, our stand-alone tool takes a data dependence graph (DDG) as input, just after a periodic register allocation done by SIRA, and applies a loop unrolling minimization (LUM).

A5.1.2. Results on randomly generated data dependence graphs

First, our stand-alone software generates the number of distinct reuse circuits k and their weights (μ1, …, μk). Afterwards, we calculate the number of remaining registers images and the loop unrolling degree ρ = lcm(μ1, …, μk). Finally, we apply our method for minimizing ρ.

We did extensive random generations on many configurations: we varied the number of available registers from 4 to 256, and we considered 10,000 random instances containing multiple ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required