CUDA Fortran for Scientists and Engineers by Massimiliano Fatica, Gregory Ruetsch

Chapter 4

Multi-GPU Programming

Abstract

This chapter covers how to write code that uses multiple GPUs. Of the many possible configurations of host processes and devices in multi-GPU code, it focuses on two: (1) a single host process driving multiple GPUs through CUDA’s peer-to-peer capabilities, introduced in the 4.0 Toolkit, and (2) MPI, where each MPI process uses a separate GPU. To illustrate each approach, we implement peer-to-peer and MPI multi-GPU versions of the transpose example from the previous chapter.

Keywords

Peer-to-peer; Unified virtual addressing (UVA); Direct transfer; Direct access; Message Passing Interface (MPI); Compute mode; NVIDIA System Management ...
