Bandwidth test – pinned versus pageable

By default, host memory allocated with malloc() is pageable: the operating system may swap those pages out to disk or relocate them whenever it needs to. Devices that sit on the PCIe bus, such as GPUs and InfiniBand adapters, expect host memory to be pinned (page-locked) before a transfer, and the GPU cannot access pageable memory directly. Hence, when a copy from pageable memory is requested, the CUDA driver allocates a temporary pinned buffer, copies the data from the pageable memory into that buffer, and only then transfers it to the device using the DMA (Direct Memory Access) engine.

This additional staging copy adds overhead and lowers the effective transfer bandwidth. Allocating the host buffer as pinned memory up front, for example with cudaMallocHost(), removes the extra copy and lets the DMA engine move the data directly between host and device.
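The following is a minimal sketch of such a bandwidth comparison, not the book's exact benchmark: it times host-to-device copies from a pageable buffer (malloc) and from a pinned buffer (cudaMallocHost). The helper name copy_bandwidth_gbps, the 64 MB buffer size, and the iteration count are illustrative choices, not values from the original text.

```c
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Time repeated host-to-device copies and report the achieved bandwidth in GB/s.
static double copy_bandwidth_gbps(void *h_src, void *d_dst, size_t bytes, int iters)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_dst, h_src, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);

    // GB/s = total bytes moved / elapsed seconds / 1e9
    return (bytes * (double)iters / (ms / 1000.0)) / 1e9;
}

int main(void)
{
    const size_t bytes = 64 << 20;  // 64 MB test buffer (arbitrary size)
    const int    iters = 20;

    void *d_buf;
    cudaMalloc(&d_buf, bytes);

    // Pageable host memory: the driver stages it through a temporary pinned buffer.
    void *h_pageable = malloc(bytes);

    // Pinned (page-locked) host memory: eligible for direct DMA transfers.
    void *h_pinned;
    cudaMallocHost(&h_pinned, bytes);

    printf("Pageable H2D: %.2f GB/s\n",
           copy_bandwidth_gbps(h_pageable, d_buf, bytes, iters));
    printf("Pinned   H2D: %.2f GB/s\n",
           copy_bandwidth_gbps(h_pinned, d_buf, bytes, iters));

    free(h_pageable);
    cudaFreeHost(h_pinned);
    cudaFree(d_buf);
    return 0;
}
```

On most systems the pinned-memory copy should report noticeably higher bandwidth, since it skips the intermediate staging copy described above.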
