June 2013
528 pages
This chapter gives a detailed description of how kernels are executed on the GPU: how they are launched, their execution characteristics, how they are organized into grids of blocks of threads, and resource management considerations. The chapter concludes with a description of dynamic parallelism—the new CUDA 5.0 feature that enables CUDA kernels to launch work for the GPU.
CUDA kernels execute on the GPU and, since the very first version of CUDA, have always executed concurrently with the CPU. In other words, kernel launches are asynchronous: control is returned to the CPU before the GPU has completed the requested operation. When CUDA was first introduced, there was no need for developers to concern ...
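The asynchronous launch behavior described above can be illustrated with a minimal sketch. The kernel `addOne`, the helper `runAsyncLaunchDemo`, and the launch configuration of 256 threads per block are hypothetical names and values chosen for illustration; they do not come from the chapter. The sketch assumes a CUDA-capable device and the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to each element.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1;
    }
}

// Launches the kernel, then explicitly waits for it to finish.
// Returns 0 on success and 1 on any failure.
int runAsyncLaunchDemo(void)
{
    const int n = 1 << 20;
    int *d = NULL;

    if (cudaMalloc((void **)&d, n * sizeof(int)) != cudaSuccess) {
        return 1;
    }
    if (cudaMemset(d, 0, n * sizeof(int)) != cudaSuccess) {
        cudaFree(d);
        return 1;
    }

    // The launch only queues work for the GPU. Control returns to the
    // CPU immediately, possibly before the kernel has started running.
    addOne<<<(n + 255) / 256, 256>>>(d, n);

    // Catches errors in the launch itself (for example, a bad configuration).
    cudaError_t err = cudaGetLastError();

    // Blocks the CPU until all previously requested GPU work has completed.
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }

    // Copy one element back to confirm the kernel ran.
    int first = -1;
    if (err == cudaSuccess) {
        err = cudaMemcpy(&first, d, sizeof(int), cudaMemcpyDeviceToHost);
    }

    cudaFree(d);
    return (err == cudaSuccess && first == 1) ? 0 : 1;
}

int main(void)
{
    int status = runAsyncLaunchDemo();
    printf("%s\n", status == 0 ? "kernel completed" : "demo failed");
    return status;
}
```

Any CPU work placed between the launch and `cudaDeviceSynchronize()` may overlap with the kernel's execution. The explicit synchronization is shown only to make the hand-off visible; a blocking `cudaMemcpy` on the same stream would also wait for the kernel to finish.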