Chapter 6

Memory Handling with CUDA

Introduction

In the conventional CPU model, memory is linear, or flat: any CPU core can access any memory location without restriction. In practice, CPU hardware typically places level one (L1), level two (L2), and level three (L3) caches between the cores and main memory. Anyone who has optimized CPU code or comes from a high-performance computing (HPC) background will be all too familiar with these caches. For most programmers, however, they are something that can easily be abstracted away.
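To make the point about caches concrete, here is a minimal host-side sketch. It is an illustration only, not code from the book. The function names (`sum_row_major`, `sum_col_major`, `traversal_orders_agree`) are hypothetical. Both functions read the same flat buffer and return the same sum, which is the flat memory model at work: any element can be reached by address in any order. Only the order of access differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a rows x cols matrix stored row-major in one flat buffer,
// visiting elements in address order (stride of 1 element).
long long sum_row_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);  // buffer must hold the whole matrix
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same sum, but visiting elements column by column
// (consecutive reads are `cols` elements apart).
long long sum_col_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}

// Hypothetical helper: fill a matrix with a simple pattern and
// check that both traversal orders produce the same result.
bool traversal_orders_agree(std::size_t rows, std::size_t cols) {
    std::vector<int> m(rows * cols);
    for (std::size_t i = 0; i < m.size(); ++i)
        m[i] = static_cast<int>(i % 7);
    return sum_row_major(m, rows, cols) == sum_col_major(m, rows, cols);
}
```

On matrices much larger than the caches, the strided version tends to run noticeably slower on typical CPUs, because each cache line it fetches is used for only one element before moving on. The exact gap depends on the hardware, so time both versions yourself rather than assuming a figure.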

Abstraction has been a trend in modern programming languages, where the programmer is further and further removed from the underlying hardware. While this can lead to higher levels of productivity, as problems ...
