OpenGL Insights by Christophe Riccio, Patrick Cozzi

Asynchronous Buffer Transfers
Ladislav Hrabcak and Arnaud Masserann
28.1 Introduction
Most 3D applications send large quantities of data from the CPU to the GPU on a
regular basis. Possible reasons include
• streaming data from hard drive or network: geometry, clipmapping, level of detail (LOD), etc.;
• updating skeletal and blend-shape animations on the CPU;
• computing a physics simulation;
• generating procedural meshes;
• data for instancing;
• setting uniform parameters for shaders with uniform buffers.
Likewise, it is often useful to read generated data back from the GPU. Possible
scenarios are
• video capture [Kemen 10];
• physics simulation;
• page resolver pass in virtual texturing;
• image histogram for computing HDR tonemapping parameters.
While copying data back and forth to the GPU is easy, the PC architecture,
without unified memory, makes it harder to do it fast. Furthermore, the OpenGL
API specification doesn't tell how to do it efficiently, and a naive use of data-transfer
functions wastes processing power on both the CPU and the GPU by introducing
pauses in the execution of the program.
In this chapter, for readers familiar with buffer objects, we are going to explain
what happens in the drivers and then present various methods, including
unconventional ones, to transfer data between the CPU and the GPU with maximum speed.
If an application needs to transfer meshes or textures frequently and efficiently, these
methods can be used to improve its performance. In this chapter, we will be using
OpenGL 3.3, which is the Direct3D 10 equivalent.
28.1.1 Explanation of Terms
First, in order to match the OpenGL specification, we refer to the GPU as the device.
Second, when an application calls OpenGL functions, the drivers translate the calls
into commands and add them to an internal queue on the CPU side. These commands are
then consumed by the device asynchronously. This queue has already been referred to
as the command queue, but in order to be clear, we refer to it as the device command
queue.
Data transfers from CPU memory to device memory will be consistently referred
to as uploading and transfers from the device memory to CPU memory as
downloading. This matches the client/server paradigm of OpenGL.
Finally, pinned memory is a portion of the main RAM that can be directly used
by the device through the PCI express bus (PCI-e). This is also known as page-locked
memory.
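To make the notion of a device command queue more concrete, the following is a toy C++ model, not driver code. It assumes a single worker thread standing in for the device, and the names ToyDevice, submit, finish, and upload_all are hypothetical. A call to submit only records a command and returns immediately, whereas finish blocks the caller until the queue has drained, which is the kind of pause this chapter tries to avoid.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Toy model only: commands are queued on the CPU side and consumed
// asynchronously by a worker thread that stands in for the device.
class ToyDevice {
public:
    ToyDevice() : worker_([this] { run(); }) {}

    ~ToyDevice() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        worker_.join();
    }

    // Like most API calls: record the command and return immediately.
    void submit(std::function<void()> command) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(command));
        }
        wake_.notify_all();
    }

    // Like a synchronous readback: block until every command has executed.
    void finish() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return queue_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping, and nothing left to do
            std::function<void()> command = std::move(queue_.front());
            queue_.pop();
            busy_ = true;
            lock.unlock();
            command();  // the "device" executes outside the lock
            lock.lock();
            busy_ = false;
            idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable idle_;
    std::queue<std::function<void()>> queue_;
    bool busy_ = false;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

// Queues `count` pretend uploads, then waits once for all of them.
std::vector<int> upload_all(int count) {
    assert(count >= 0);
    std::vector<int> device_memory;  // stands in for memory owned by the device
    ToyDevice device;
    for (int i = 0; i < count; ++i) {
        device.submit([&device_memory, i] { device_memory.push_back(i); });
    }
    device.finish();  // the only point where the CPU stalls
    return device_memory;
}
```

In this model the calling thread is free to do other work between submit and finish. A real driver is far more involved, but the same split between issuing a command and waiting for its result is what the rest of the chapter relies on.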
28.2 Buffer Objects
There are many buffer-object targets. The most well-known are GL_ARRAY_BUFFER
for vertex attributes and GL_ELEMENT_ARRAY_BUFFER for vertex indices, formerly
known as vertex buffer objects (VBOs). However, there are also
GL_PIXEL_PACK_BUFFER and GL_TRANSFORM_FEEDBACK_BUFFER and many other useful ones. As
all these targets relate to the same kind of objects, they are all equivalent from a
transfer point of view. Thus, everything we will describe in this chapter is valid for
any buffer object target.
Buffer objects are linear memory regions allocated in device memory or in CPU
memory. They can be used in many ways, such as
• the source of vertex data,
• texture buffer, which allows shaders to access large linear memory regions
  (128–256 MTexels on GeForce 400 series and Radeon HD 5000 series) [ARB 09a],
