Chapter 7

Multidevice programming with OpenACC

Jiri Kraus, NVIDIA GmbH, Würselen, Germany


The purpose of this chapter is to explain how to program multiple OpenACC devices to work cooperatively on a single problem.

At the end of this chapter the reader will have a basic understanding of:

• How to program multidevice systems or accelerated clusters with OpenACC using a single host thread, OpenMP, or MPI

• How to coordinate the work of multiple devices using a domain decomposition strategy

• How to use the async clause to overlap computation and MPI communication

• How to use the NVIDIA® tools for MPI+OpenACC applications
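
As a rough illustration of the first two objectives, the sketch below splits a one-dimensional array into contiguous chunks and hands one chunk to each device from a single host thread. It is only a minimal sketch under stated assumptions, not the chapter's own code: the helper names `chunk_range` and `scale_multidevice` are hypothetical, and the OpenACC runtime calls are guarded by `_OPENACC` so that the file also compiles and runs serially with a compiler that ignores the directives.

```c
#include <assert.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: split indices [0, n) into num_parts contiguous
   chunks and return the half-open range [*start, *end) owned by `part`.
   The first (n % num_parts) chunks receive one extra element. */
void chunk_range(int n, int num_parts, int part, int *start, int *end)
{
    assert(num_parts > 0 && part >= 0 && part < num_parts);
    int base = n / num_parts;
    int rem  = n % num_parts;
    *start = part * base + (part < rem ? part : rem);
    *end   = *start + base + (part < rem ? 1 : 0);
}

/* Hypothetical example kernel: scale x[0..n) by a, one chunk per device,
   driven from a single host thread. */
void scale_multidevice(float *x, int n, float a)
{
    int num_devices = 1;
#ifdef _OPENACC
    acc_device_t dev_type = acc_get_device_type();
    num_devices = acc_get_num_devices(dev_type);
    if (num_devices < 1) num_devices = 1;   /* assumption: fall back to one chunk */
#endif

    /* Pass 1: select each device in turn and launch its chunk asynchronously,
       so the host thread can move on to the next device without blocking. */
    for (int dev = 0; dev < num_devices; ++dev) {
        int start, end;
        chunk_range(n, num_devices, dev, &start, &end);
        if (start == end) continue;         /* nothing to do for this device */
#ifdef _OPENACC
        acc_set_device_num(dev, dev_type);
#endif
        #pragma acc parallel loop copy(x[start:end-start]) async
        for (int i = start; i < end; ++i) {
            x[i] *= a;
        }
    }

    /* Pass 2: wait for the asynchronous work queued on every device. */
    for (int dev = 0; dev < num_devices; ++dev) {
#ifdef _OPENACC
        acc_set_device_num(dev, dev_type);
#endif
        #pragma acc wait
    }
}
```

Calling `scale_multidevice(x, n, 2.0f)` doubles every element of `x`. Splitting the launch and the wait into two loops is what lets the devices work concurrently; waiting inside the first loop would serialize them.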


Keywords: Multidevice programming; OpenACC; Domain decomposition; GPU; CUDA-aware MPI; Debugging; ...
