Chapter 4

Multi-GPU Programming

Abstract

This chapter covers how to write code that uses multiple GPUs. Although many configurations of host processes and devices are possible in multi-GPU code, this chapter focuses on two: (1) a single host process driving multiple GPUs through CUDA's peer-to-peer capabilities, introduced in the CUDA 4.0 Toolkit, and (2) MPI, where each MPI process uses a separate GPU. As an example of each approach, we implement peer-to-peer and MPI multi-GPU versions of the transpose code from the previous chapter.
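To give a flavor of the first configuration, the following is a minimal sketch of checking for and enabling peer access between two devices from a single host process. It assumes a system with at least two peer-capable GPUs; the device numbers and variable names are illustrative only.

```fortran
program p2pSketch
  use cudafor
  implicit none
  integer :: istat, canAccess

  ! check whether device 0 can address device 1's memory directly
  istat = cudaDeviceCanAccessPeer(canAccess, 0, 1)

  if (canAccess == 1) then
     ! peer access is enabled from the currently selected device,
     ! so select device 0 first
     istat = cudaSetDevice(0)
     ! second argument is a flag reserved for future use; must be 0
     istat = cudaDeviceEnablePeerAccess(1, 0)
  end if
end program p2pSketch
```

Once peer access is enabled, transfers between the two devices' memories (and, with unified virtual addressing, direct access from kernels) can proceed without staging through host memory.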

Keywords

Peer-to-peer; Unified virtual addressing (UVA); Direct transfer; Direct access; Message Passing Interface (MPI); Compute mode; NVIDIA System Management ...
