Chapter 3
The Graphics Processing Unit
“The display is the computer.”
—Jen-Hsun Huang
Historically, hardware graphics acceleration has started at the end of the
pipeline, first performing rasterization of a triangle’s scanlines. Successive
generations of hardware have then worked back up the pipeline, to the point
where some higher level application-stage algorithms are being committed
to the hardware accelerator. Dedicated hardware’s only advantage over
software is speed, but speed is critical.
Over the past decade, graphics hardware has undergone an incredible
transformation. The first consumer graphics chip to include hardware
vertex processing (NVIDIA’s GeForce 256) shipped in 1999. NVIDIA coined
the term graphics processing unit (GPU) to differentiate the GeForce 256
from the previously available rasterization-only chips, and it stuck [898].
Over the next few years, the GPU evolved from configurable implementations
of a complex fixed-function pipeline to highly programmable “blank
slates” where developers could implement their own algorithms.
Programmable shaders of various kinds are the primary means by which the
GPU is controlled. The vertex shader enables various operations (including
transformations and deformations) to be performed on each vertex.
Similarly, the pixel shader processes individual pixels, allowing complex
shading equations to be evaluated per pixel. The geometry shader allows
the GPU to create and destroy geometric primitives (points, lines,
triangles) on the fly. Computed values can be written to multiple
high-precision buffers and reused as vertex or texture data. For efficiency,
some parts of the pipeline remain configurable, not programmable, but the
trend is toward programmability and flexibility [123].
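The per-vertex and per-primitive roles described above can be illustrated with a minimal CPU-side sketch. This is plain Python, not shader code, and every name in it (`transform`, `vertex_stage`, `geometry_stage`, `cull_behind_origin`) is hypothetical. It only shows the shape of the idea: the vertex stage applies a user-supplied function to each vertex, while the geometry stage applies one to each primitive and may emit zero or more primitives.

```python
# CPU-side illustration only; real shaders run on the GPU in a shading language.
from typing import Callable, List, Tuple

Vec4 = Tuple[float, float, float, float]
Triangle = Tuple[Vec4, Vec4, Vec4]


def transform(m: List[List[float]], v: Vec4) -> Vec4:
    """Multiply a 4x4 row-major matrix by a column vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


def vertex_stage(vertices: List[Vec4],
                 shader: Callable[[Vec4], Vec4]) -> List[Vec4]:
    """Run the user-supplied shader once per vertex."""
    return [shader(v) for v in vertices]


def geometry_stage(triangles: List[Triangle],
                   shader: Callable[[Triangle], List[Triangle]]) -> List[Triangle]:
    """Run the shader once per primitive; it may return zero, one, or many."""
    out: List[Triangle] = []
    for tri in triangles:
        out.extend(shader(tri))
    return out


def cull_behind_origin(tri: Triangle) -> List[Triangle]:
    """Hypothetical geometry shader: destroy triangles with every z below 0."""
    return [] if all(v[2] < 0.0 for v in tri) else [tri]


# Hypothetical vertex shader: translate every vertex by (1, 2, 3).
TRANSLATE = [[1.0, 0.0, 0.0, 1.0],
             [0.0, 1.0, 0.0, 2.0],
             [0.0, 0.0, 1.0, 3.0],
             [0.0, 0.0, 0.0, 1.0]]

moved = vertex_stage([(0.0, 0.0, 0.0, 1.0)], lambda v: transform(TRANSLATE, v))
```

A pixel shader follows the same per-element pattern as `vertex_stage`, with the function evaluated once per pixel instead of once per vertex.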
[Figure 3.1 diagram: Vertex Shader → Geometry Shader → Clipping → Screen Mapping → Triangle Setup → Triangle Traversal → Pixel Shader → Merger]
Figure 3.1. GPU implementation of the rendering pipeline. The stages are color coded
according to the degree of user control over their operation. Green stages are fully
programmable. Yellow stages are configurable but not programmable, e.g., the clipping
stage can optionally perform culling or add user-defined clipping planes. Blue stages are
completely fixed in their function.
3.1 GPU Pipeline Overview
The GPU implements the geometry and rasterization conceptual pipeline
stages described in Chapter 2. These are divided into several hardware
stages with varying degrees of configurability or programmability.
Figure 3.1 shows the various stages color coded according to how
programmable or configurable they are. Note that these physical stages are
split up slightly differently than the functional stages presented in
Chapter 2.
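The color coding of Figure 3.1 can be written down as a small lookup table. The sketch below is an illustration only, and its names (`Control`, `PIPELINE`, `stages_with`) are hypothetical. The classifications follow the figure caption and the stage descriptions in this section. Clipping is listed as configurable, following the caption.

```python
from enum import Enum
from typing import List, Tuple


class Control(Enum):
    """Degree of user control, keyed to the colors in Figure 3.1."""
    PROGRAMMABLE = "green"
    CONFIGURABLE = "yellow"
    FIXED = "blue"


# Stages in pipeline order, each with its degree of user control.
PIPELINE: List[Tuple[str, Control]] = [
    ("Vertex Shader", Control.PROGRAMMABLE),
    ("Geometry Shader", Control.PROGRAMMABLE),
    ("Clipping", Control.CONFIGURABLE),       # per the Figure 3.1 caption
    ("Screen Mapping", Control.FIXED),
    ("Triangle Setup", Control.FIXED),
    ("Triangle Traversal", Control.FIXED),
    ("Pixel Shader", Control.PROGRAMMABLE),
    ("Merger", Control.CONFIGURABLE),
]


def stages_with(control: Control) -> List[str]:
    """Return the names of all stages with the given degree of control."""
    return [name for name, c in PIPELINE if c is control]


programmable = stages_with(Control.PROGRAMMABLE)
```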
The vertex shader is a fully programmable stage that is typically used
to implement the “Model and View Transform,” “Vertex Shading,” and
“Projection” functional stages. The geometry shader is an optional, fully
programmable stage that operates on the vertices of a primitive (point,
line, or triangle). It can be used to perform per-primitive shading
operations, to destroy primitives, or to create new ones. The clipping,
screen mapping, triangle setup, and triangle traversal stages are
fixed-function stages that implement the functional stages of the same
names. Like the vertex and geometry shaders, the pixel shader is fully
programmable and performs the “Pixel Shading” functional stage. Finally,
the merger stage is somewhere between the full programmability of the
shader stages and the fixed operation of the other stages. Although it is
not programmable, it is highly configurable and can be set to perform a
wide variety of operations. It implements the “Merging” functional stage,
which is in charge of modifying the color, Z-buffer, blend, stencil, and
other related buffers.
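To make the distinction between "configurable" and "programmable" concrete, here is a simplified sketch of a merger-like step. It is an assumption-laden illustration, not a description of any real API or hardware: the names `MergerConfig`, `DEPTH_TESTS`, `BLEND_MODES`, and `merge` are hypothetical, and only depth testing and color blending are modeled. The point is that the user selects from a fixed menu of operations and cannot supply arbitrary code.

```python
from dataclasses import dataclass
from typing import Tuple

Color = Tuple[float, float, float]

# Fixed menus of operations: the user picks an entry but cannot add new ones.
DEPTH_TESTS = {
    "less": lambda new, old: new < old,
    "less_equal": lambda new, old: new <= old,
    "always": lambda new, old: True,
}

BLEND_MODES = {
    "replace": lambda src, dst, a: src,
    "add": lambda src, dst, a: tuple(min(1.0, s + d) for s, d in zip(src, dst)),
    "alpha": lambda src, dst, a: tuple(a * s + (1.0 - a) * d
                                       for s, d in zip(src, dst)),
}


@dataclass
class MergerConfig:
    """Hypothetical merger settings, chosen by name from the menus above."""
    depth_test: str = "less"
    blend: str = "replace"


def merge(cfg: MergerConfig, src_color: Color, src_alpha: float,
          src_depth: float, dst_color: Color,
          dst_depth: float) -> Tuple[Color, float]:
    """Return the (color, depth) stored after merging one fragment into a pixel."""
    if not DEPTH_TESTS[cfg.depth_test](src_depth, dst_depth):
        return dst_color, dst_depth  # fragment fails the test; buffers unchanged
    blended = BLEND_MODES[cfg.blend](src_color, dst_color, src_alpha)
    return blended, src_depth


red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
nearer = merge(MergerConfig(), red, 1.0, 0.3, blue, 0.5)
farther = merge(MergerConfig(), red, 1.0, 0.7, blue, 0.5)
```

With the default settings, the nearer fragment replaces the stored color and depth, while the farther one leaves both untouched. Changing behavior means changing a setting in `MergerConfig`, not writing a new function.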
Over time, the GPU pipeline has evolved away from hard-coded operation
and toward increasing flexibility and control. The introduction of
programmable shader stages was the most important step in this evolution.
The next section describes the features common to the various
programmable stages.
3.2 The Programmable Shader Stage
Modern shader stages (i.e., those that support Shader Model 4.0, DirectX
10 and later, on Vista) use a common-shader core. This means that the
