November 2018
Intermediate to advanced
310 pages
7h 54m
English
In this chapter, we'll see how to write effective CUDA kernels. In GPU programming, a kernel (a term we use interchangeably with CUDA kernel and kernel function) is a parallel function that can be launched directly from the host (the CPU) onto the device (the GPU). A device function, by contrast, can only be called from a kernel or from another device function. (Generally speaking, device functions look and act like normal serial C/C++ functions, except that they run on the GPU and are called in parallel from kernels.)
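The split between kernels and device functions can be sketched without a GPU. What follows is a plain-Python, serial emulation and not real CUDA code: the names `scale`, `scale_kernel`, and `launch` are hypothetical, introduced only for illustration, and the loop inside `launch` merely stands in for threads that a real device would run in parallel.

```python
def scale(x, factor):
    """Stand-in for a device function: ordinary serial code that is
    only ever called from a kernel (or from another device function)."""
    return x * factor


def scale_kernel(thread_id, data, factor, out):
    """Stand-in for a kernel: every thread runs this same body, and each
    thread handles the one element that matches its own ID."""
    out[thread_id] = scale(data[thread_id], factor)


def launch(kernel, num_threads, *args):
    """Stand-in for a host-side launch. A GPU would run the threads in
    parallel; this emulation simply loops over the thread IDs."""
    for thread_id in range(num_threads):
        kernel(thread_id, *args)


data = [1.0, 2.0, 3.0, 4.0]
result = [0.0] * len(data)

# The "host" launches the kernel with one thread per element.
launch(scale_kernel, len(data), data, 2.0, result)
print(result)  # [2.0, 4.0, 6.0, 8.0]
```

Note that the host only ever calls `launch`; `scale` is reached solely through the kernel, which mirrors the calling restriction described above.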
We'll then get an understanding of how CUDA uses the notion of threads, blocks, and grids to abstract away some of the underlying technical details of the GPU ...
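As a rough illustration of that abstraction, here is a minimal pure-Python sketch of a one-dimensional launch, again a serial emulation, not GPU code. In CUDA, a thread's global index in one dimension is computed as `blockIdx.x * blockDim.x + threadIdx.x`; the helper names below (`global_index`, `launch_1d`, `record_index_kernel`) are hypothetical and exist only for this example.

```python
def global_index(block_idx, block_dim, thread_idx):
    """Mirror of CUDA's 1D formula: blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def record_index_kernel(block_idx, block_dim, thread_idx, out):
    """Emulated kernel: each thread records which (block, thread) pair
    is responsible for its global index."""
    i = global_index(block_idx, block_dim, thread_idx)
    if i < len(out):  # guard: the grid may contain more threads than elements
        out[i] = (block_idx, thread_idx)


def launch_1d(kernel, grid_dim, block_dim, *args):
    """Emulated launch: a grid of `grid_dim` blocks, each holding
    `block_dim` threads, executed serially here."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


n = 10
block_dim = 4
grid_dim = (n + block_dim - 1) // block_dim  # round up so all n elements are covered

owners = [None] * n
launch_1d(record_index_kernel, grid_dim, block_dim, owners)
print(grid_dim)   # 3
print(owners[5])  # (1, 1)
```

Here 3 blocks of 4 threads give 12 threads for 10 elements, so the bounds check in the kernel keeps the last two threads from writing past the end of the array.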