Optimizing unified memory using data prefetching

Now, let's look at an easier method, called data prefetching. One key strength of CUDA is that it offers the developer a range of techniques, from the easiest ones to those that require ninja programming skills. Data prefetching is essentially a hint to the driver to migrate data to the device before we expect it to be used there, which avoids on-demand page faults during kernel execution. CUDA provides a prefetching API called cudaMemPrefetchAsync() for this purpose. To see it in use, let's look at the unified_memory_prefetch.cu file, which we compiled earlier. A snapshot of this code is shown in the following code snippet:

// Allocate Unified Memory -- accessible from CPU or GPU
cudaMallocManaged(&x, N*sizeof(float));
...
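The snapshot above is elided, so here is a minimal, self-contained sketch of the typical prefetching pattern, assuming a simple element-wise `add` kernel and a problem size `N` (both illustrative, not necessarily the contents of unified_memory_prefetch.cu). The key calls are the two prefetches to the GPU before the kernel launch and the prefetch back to the host, via `cudaCpuDeviceId`, before the CPU touches the result:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Illustrative kernel: element-wise y[i] = x[i] + y[i]
__global__ void add(int n, float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = x[i] + y[i];
}

int main() {
    int N = 1 << 20;
    float *x, *y;

    // Allocate Unified Memory -- accessible from CPU or GPU
    cudaMallocManaged(&x, N * sizeof(float));
    cudaMallocManaged(&y, N * sizeof(float));

    // Initialize on the host; the pages currently reside CPU-side
    for (int i = 0; i < N; i++) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    // Hint the driver to migrate both arrays to the GPU
    // before the kernel needs them
    int device = -1;
    cudaGetDevice(&device);
    cudaMemPrefetchAsync(x, N * sizeof(float), device, NULL);
    cudaMemPrefetchAsync(y, N * sizeof(float), device, NULL);

    add<<<(N + 255) / 256, 256>>>(N, x, y);

    // Prefetch the result back to the host before the CPU reads it
    cudaMemPrefetchAsync(y, N * sizeof(float), cudaCpuDeviceId, NULL);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);

    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Because the prefetches are asynchronous hints issued on a stream (the default stream here), they overlap with other work where possible; the `cudaDeviceSynchronize()` call is what guarantees the kernel and the host-bound prefetch have completed before the CPU reads `y`.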
