Chapter 4: Synchronization
Abstract
This chapter covers language features that deal with synchronization, both between operations on the host and device and between GPU threads running within a kernel.
Keywords
Synchronization; CUDA Fortran language; Kernel execution; Data transfers; Streams; Asynchronous transfers; Synchronization barriers; Warps
In parallel programming, we want as many operations as possible to execute simultaneously. The CUDA programming model builds in this parallelism by launching kernels with many independent threads. There are occasions, however, when operations are not completely independent: certain operations must complete before others can start, and sometimes it is advantageous to share data between otherwise ...