March 2026
Intermediate
534 pages
12h 51m
English
This chapter introduces CUDA streams, a feature of the CUDA programming model that can improve GPU performance by overlapping independent tasks. By default, GPU operations are issued to a single default stream and execute one after another in the order they were issued. This sequential execution can underutilize the GPU: the compute units sit idle while a data transfer is in flight, and the copy engines sit idle while a kernel runs. CUDA streams address this limitation by letting independent operations placed in different streams proceed concurrently, for example a host-to-device copy for one batch overlapping with a kernel that processes another. This is most useful for applications that process data in repeated batches or that require real-time responsiveness. In this chapter, we begin with an overview of CUDA streams, then explore how to implement stream concurrency in Numba-CUDA. We will then ...