Increasing Performance through Optimization on APU
Matthew Doerksen, Parimala Thulasiraman, and Ruppa Thulasiram
As we move into the exascale era of computing, heterogeneous architectures have become an integral component of high-performance systems (HPSs) and high-performance computing (HPC). Over time, we have transitioned from homogeneous, central processing unit (CPU)-centric HPSs such as Jaguar to heterogeneous HPSs such as Roadrunner, which uses a modified Cell processor, and the graphics processing unit (GPU)-based Tianhe-1A. These HPSs have been vital for research applications but, until recently, have not been a factor in the consumer-level experience. However, with new technologies such as AMD’s accelerated processing unit (APU) architecture, which fuses the CPU and the GPU onto a single chip, consumers now have an affordable HPS at their disposal.
27.2 HETEROGENEOUS ARCHITECTURES
To begin, we provide a basic overview of the different types of heterogeneous architectures currently available. A short list includes the Cell Broadband Engine (Cell BE), GPUs from AMD and NVIDIA, and AMD’s Fusion APU. Each of these architectures has its own advantages and disadvantages that, in part, determine how well it will perform for a particular situation or algorithm.