Lying, the telling of beautiful untrue things, is the proper aim of Art.
Oscar Wilde, “The Decay of Lying,” Intentions (1891)
The purpose of this chapter is to provide the bare minimum of background information about computer hardware to motivate the optimizations described in this book, to prevent readers going mad poring over 600-page processor handbooks. It takes a superficial look at processor architecture to extract some heuristics to guide optimization efforts. The very impatient reader might skip this chapter for now and return when other chapters refer to it, though the information here is important and useful.
Microprocessor devices in current use are incredibly diverse. They range from sub-$1 embedded devices with just a few thousand gates and clock rates below 1 MHz to desktop-class devices with billions of gates and gigahertz clocks. Mainframe computers can be the size of a large room, containing thousands of independent execution units and drawing enough electrical power to light a small city. It is tempting to believe that nothing connects this profusion of computing devices that would lead to usable generalizations. In reality, though, they have useful similarities. After all, if there were no similarities, it would not be possible to compile C++ code for the many processors for which there are compilers.
All computers in wide use execute instructions stored in memory. The instructions act upon data that is also ...