Chapter 18

Programming a heterogeneous computing cluster

Isaac Gelado and Javier Cabezas

Abstract

This chapter introduces joint MPI/CUDA programming. It presents enough basic MPI concepts for the reader to understand a simple MPI/CUDA program. It then focuses on the practical use of pinned memory and asynchronous data transfers to overlap computation with communication. The chapter ends with an overview of how CUDA-aware MPI implementations simplify the code and improve efficiency.

Keywords

Message passing interface; message passing; communication; overlapping communication with computation; asynchronous data transfer; domain partitioning; collective communication; point-to-point communication; pinned memory; CUDA streams; barrier
