November 2016
Intermediate to advanced
576 pages
18h 22m
English
Isaac Gelado and Javier Cabezas
This chapter introduces joint MPI/CUDA programming. It presents enough basic MPI concepts for the reader to understand a simple MPI/CUDA program, then focuses on the practical use of pinned memory and asynchronous data transfers to overlap computation with communication. The chapter ends with an overview of how CUDA-aware MPI systems help simplify the code and improve efficiency.
Message passing interface; message passing; communication; overlapping communication with computation; asynchronous; domain partition; collective; point-to-point communication; pinned memory; CUDA streams; barrier