Programming a heterogeneous computing cluster
Isaac Gelado and Javier Cabezas
Abstract
This chapter introduces joint MPI/CUDA programming. It presents enough basic MPI concepts for the reader to understand a simple MPI/CUDA program, then focuses on the practical use of pinned memory and asynchronous data transfers to overlap computation with communication. The chapter ends with an overview of how CUDA-aware MPI implementations help simplify the code and improve efficiency.
Keywords
Message passing interface; message passing; communication; overlapping communication with computation; asynchronous; domain partition; collective; point-to-point communication; pinned memory; CUDA streams; barrier