Chapter 8

Advanced CUDA Programming

Abstract

In the recent past, CUDA has become the major framework for programming massively parallel accelerators. NVIDIA estimates that the number of CUDA installations exceeded one million in 2016. Moreover, with the rise of Deep Learning, this number is expected to grow at an exponential rate in the foreseeable future. Hence, extensive CUDA knowledge is fundamental for every programmer in the field of High Performance Computing. The previous chapter focused on the basic programming model and the memory hierarchy of modern GPUs. We have seen that proper memory utilization is key to obtaining efficient code. While our examples from the previous chapter focused on thread-level implementations, ...
