March 2026
Intermediate
534 pages
12h 51m
English
In this chapter, we will explore how to use multiple GPUs when a single GPU becomes a bottleneck. We will begin by highlighting the importance of multi-GPU computing, providing an overview of multi-GPU systems and two commonly used parallelism approaches. Next, we will present practical examples using Numba-CUDA to introduce core concepts such as data partitioning, data movement between device memories, and distributed computing. We will then scale these examples to a multi-node environment using a Dask cluster, which lets us orchestrate computation across multiple GPUs and machines. Finally, we will demonstrate multi-GPU computing in JAX and show how to perform distributed training of a machine learning model.
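As a taste of the data-partitioning idea mentioned above, here is a minimal, framework-free sketch: it splits a dataset into near-equal contiguous chunks, one per device. This is not code from the chapter; in the Numba-CUDA examples each chunk would subsequently be copied to its own GPU, while here we illustrate only the partitioning step, and the `partition` helper is a hypothetical name.

```python
# Conceptual sketch: partition a 1-D dataset into per-device chunks.
# In a real multi-GPU program, each chunk would then be transferred
# to its own GPU's memory before launching kernels on it.

def partition(data, num_devices):
    """Split `data` into `num_devices` contiguous chunks of
    near-equal size; the first chunks absorb the remainder."""
    base, extra = divmod(len(data), num_devices)
    chunks, start = [], 0
    for i in range(num_devices):
        size = base + (1 if i < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks

chunks = partition(list(range(10)), 3)
print([len(c) for c in chunks])  # → [4, 3, 3]
```

Contiguous chunking like this keeps each device's slice of the data in one piece, which simplifies the host-to-device copies the chapter's examples build on.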