Chapter 6

Memory Handling with CUDA

Introduction

In the conventional CPU model we have what is called a linear or flat memory model. This is where any single CPU core can access any memory location without restriction. In practice, for CPU hardware, you typically see a level one (L1), level two (L2), and level three (L3) cache. Those people who have optimized CPU code or come from a high-performance computing (HPC) background will be all too familiar with this. For most programmers, however, it’s something they can easily abstract away.

Abstraction has been a trend in modern programming language, where the programmer is further and further removed from the underlying hardware. While this can lead to higher levels of productivity, as problems ...

Get CUDA Programming now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.