Programming a heterogeneous computing cluster
An introduction to CUDA streams
With special contributions from Isaac Gelado and Javier Cabezas
Abstract
This chapter introduces joint Message Passing Interface (MPI)/CUDA programming using the stencil pattern. It covers enough basic MPI concepts for the reader to understand a simple MPI/CUDA program, then focuses on the practical use of pinned memory and asynchronous data transfers to overlap computation with communication. The chapter ends with an overview of how CUDA-aware MPI systems simplify the code and improve efficiency.
Keywords
MPI; message passing; communication; overlapping communication with computation; asynchronous; domain partition; ...