Chapter 4: Synchronization

Abstract

This chapter covers language features that deal with synchronization, both between operations on the host and device and between GPU threads running within a kernel.

Keywords

Synchronization; CUDA Fortran language; Kernel execution; Data transfers; Streams; Asynchronous transfers; Synchronization barriers; Warps

In parallel programming, we want to have as many operations as possible executing simultaneously. The CUDA programming model builds in this parallelism by launching kernels with many independent threads. There are occasions, however, when operations are not completely independent: certain operations must complete before others can start, and sometimes it is advantageous to share data between otherwise ...

CUDA Fortran for Scientists and Engineers, 2nd Edition by Gregory Ruetsch, Massimiliano Fatica
