The CUDA Handbook: A Comprehensive Guide to GPU Programming

Chapter 7. Kernel Execution

This chapter describes in detail how kernels are executed on the GPU: how they are launched, their execution characteristics, how they are organized into grids of blocks of threads, and resource management considerations. The chapter concludes with a description of dynamic parallelism, the feature introduced in CUDA 5.0 that enables CUDA kernels to launch work for the GPU.

7.1. Overview

CUDA kernels execute on the GPU and, since the very first version of CUDA, have always executed concurrently with the CPU. In other words, kernel launches are asynchronous: control is returned to the CPU before the GPU has completed the requested operation. When CUDA was first introduced, there was no need for developers to concern ...
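The asynchronous behavior described above can be illustrated with a minimal sketch. It is not taken from the book: the kernel writeIndex, the helper launchAndWait, and the block size of 256 are hypothetical names and values chosen for illustration. Only the CUDA runtime calls (cudaMalloc, cudaGetLastError, cudaDeviceSynchronize, cudaMemcpy, cudaFree) are real API functions. The launch statement returns to the CPU right away, so the host has to synchronize before it can rely on the kernel's results.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: each thread stores its global index.
__global__ void writeIndex(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;
    }
}

// Launches the kernel, waits for it to finish, and returns the last element
// written by the GPU (n - 1 on success) or -1 if any CUDA call fails.
int launchAndWait(int n)
{
    int *dOut = nullptr;
    if (cudaMalloc(&dOut, n * sizeof(int)) != cudaSuccess) {
        return -1;
    }

    const int threads = 256;  // assumed block size, not prescribed by the text
    const int blocks = (n + threads - 1) / threads;

    // The launch is asynchronous: this statement returns control to the CPU
    // before the GPU has necessarily finished (or even started) the kernel.
    writeIndex<<<blocks, threads>>>(dOut, n);

    // The CPU could do unrelated work here while the GPU runs.

    // Launch-configuration errors are reported by cudaGetLastError();
    // errors during execution surface when the host synchronizes.
    if (cudaGetLastError() != cudaSuccess ||
        cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(dOut);
        return -1;
    }

    int last = -1;
    cudaError_t status = cudaMemcpy(&last, dOut + (n - 1), sizeof(int),
                                    cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return (status == cudaSuccess) ? last : -1;
}

int main()
{
    int last = launchAndWait(1024);
    if (last < 0) {
        printf("CUDA error or no usable device\n");
        return 1;
    }
    printf("last element written by the GPU: %d\n", last);
    return 0;
}
```

The explicit cudaDeviceSynchronize() call is there to make the wait visible. A blocking cudaMemcpy on the default stream would also wait for the kernel, but synchronizing first separates execution errors from copy errors.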

The CUDA Handbook: A Comprehensive Guide to GPU Programming by Nicholas Wilt
