The CUDA Handbook: A Comprehensive Guide to GPU Programming by Nicholas Wilt

Chapter 11. Streaming Workloads

Streaming workloads are among the simplest that can be ported to CUDA: computations where each data element can be computed independently of the others, often with such low computational density that the workload is bandwidth-bound. Streaming workloads do not use many of the hardware resources of the GPU, such as caches and shared memory, that are designed to optimize reuse of data.
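To make the definition concrete, here is a minimal sketch of one classic streaming computation, SAXPY (out[i] = alpha * x[i] + y[i]). The names `saxpy_kernel`, `saxpy_on_device`, and `check` are hypothetical, chosen for illustration, and the launch configuration (256 threads per block, at most 1024 blocks) is an untuned assumption. Each element is computed independently of the others, and the kernel moves 12 bytes (two 4-byte loads, one 4-byte store) for every multiply-add, which is why workloads of this kind tend to be limited by memory bandwidth rather than arithmetic.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call failed.
static void check(cudaError_t status, const char *what)
{
    if (status != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(status));
        std::exit(EXIT_FAILURE);
    }
}

// Grid-stride loop: each thread handles elements i, i + stride, ...
// No element depends on any other, and no shared memory is used.
__global__ void saxpy_kernel(float *out, const float *x, const float *y,
                             float alpha, size_t n)
{
    size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
         i < n; i += stride) {
        out[i] = alpha * x[i] + y[i];
    }
}

// Hypothetical host wrapper: copies inputs to the device, runs the kernel,
// and copies the result back.
std::vector<float> saxpy_on_device(float alpha,
                                   const std::vector<float> &x,
                                   const std::vector<float> &y)
{
    assert(x.size() == y.size());
    size_t n = x.size();
    std::vector<float> out(n);
    if (n == 0) return out;

    size_t bytes = n * sizeof(float);
    float *dx = nullptr, *dy = nullptr, *dout = nullptr;
    check(cudaMalloc(&dx, bytes), "cudaMalloc dx");
    check(cudaMalloc(&dy, bytes), "cudaMalloc dy");
    check(cudaMalloc(&dout, bytes), "cudaMalloc dout");
    check(cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice), "copy x");
    check(cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice), "copy y");

    // Assumed launch configuration; not tuned for any particular GPU.
    const unsigned threads = 256;
    size_t wanted = (n + threads - 1) / threads;
    unsigned blocks = (unsigned)(wanted < 1024 ? wanted : 1024);
    saxpy_kernel<<<blocks, threads>>>(dout, dx, dy, alpha, n);
    check(cudaGetLastError(), "kernel launch");

    // A blocking cudaMemcpy waits for the preceding kernel to finish.
    check(cudaMemcpy(out.data(), dout, bytes, cudaMemcpyDeviceToHost),
          "copy out");

    cudaFree(dx);
    cudaFree(dy);
    cudaFree(dout);
    return out;
}
```

The wrapper copies data across the bus in both directions only to keep the sketch self-contained. For a kernel this light, those transfers would likely cost more than the computation itself, so in practice the inputs and outputs would already reside in device memory.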

Since GPUs give the biggest benefits on workloads with high computational density, it is worth reviewing some cases where it still makes sense to port streaming workloads to GPUs.

• If the input and output are in device memory, it doesn’t make sense to transfer the data back to the CPU just to perform one operation.

• If the ...
