32

–––––––––––––––––––––––

Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach

George Bosilca, Aurelien Bouteiller, Anthony Danalis, Thomas Herault, Piotr Luszczek, and Jack J. Dongara

32.1  INTRODUCTION AND MOTIVATION

Among the various factors that drive the momentous changes occurring in the design of microprocessors and high-end systems [1], three stand out as especially notable:

  1. The number of transistors per chip will continue the current trend, that is, double roughly every 18 months, while the speed of processor clocks will cease to increase.
  2. The physical limit on the number and bandwidth of the CPUs pins is becoming a near-term reality.
  3. A strong drift toward hybrid/heterogeneous systems for petascale (and larger) systems is taking place.

While the first two involve fundamental physical limitations that current technology trends are unlikely to overcome in the near term, the third is an obvious consequence of the first two, combined with the economic necessity of using many thousands of computational units to scale up to petascale and larger systems.

More transistors and slower clocks require multicore designs and an increased parallelism. The fundamental laws of traditional processor design—increasing transistor density, speeding up clock rate, lowering voltage—have now been stopped by a set of physical barriers: excess heat produced, too much power consumed, too much energy leaked, and useful signal overcome by noise. Multicore designs ...

Get Scalable Computing and Communications: Theory and Practice now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.