OpenGL Insights by Christophe Riccio, Patrick Cozzi

Indexing Multiple Vertex Arrays
Arnaud Masserann
26.1 Introduction
One of OpenGL's features is vertex buffer object (VBO) indexing, which allows
developers to reuse a single vertex in several primitives. Since vertex attributes
don't need to be duplicated, indexing saves memory and bandwidth. Given that the
GPU is often memory-bound, most of the time we can get extra speed with indexing.
Indexing requires having a single index for positions, texture coordinates,
normals, and so on. Unfortunately, this is not how many 3D file formats work: for
instance, COLLADA has different indices for each vertex attribute. This is
problematic in asset pipelines, where models can come from a variety of sources.
This chapter shows a simple algorithm that transforms several attribute buffers,
each using different indices, into a format that is directly usable by OpenGL. For
applications that do not use indexing, this chapter provides a simple way to improve
run-time performance. In practice, speedups of about 1.4 times can be expected, and
this format opens possibilities for further optimizations.
26.2 The Problem
With nonindexed VBOs (see Figure 26.1), we need to specify all attributes for each
vertex: position, color, and all needed UV coordinates, normals, tangents,
bitangents, etc.
[Diagram: two triangles sharing an edge, with corners (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), and (x3, y3, z3).]
Vertex Array Buffer: (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), (x2, y2, z2), (x1, y1, z1), (x3, y3, z3)
Normal Array Buffer: (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), (x2, y2, z2), (x1, y1, z1), (x3, y3, z3)
Figure 26.1. A nonindexed VBO.
Nonindexed VBOs suffer from two performance penalties. First, on most meshes
this method uses more memory. For instance, on a sphere with 1,000 vertices, all
vertices are shared by three triangles. A nonindexed VBO with GL_FLOAT attributes
for positions, UVs, and normals will take 3 × 1,000 × (3 × 4 + 2 × 4 + 3 × 4) =
3 × 1,000 × 32 = 96,000 bytes. A similar, indexed VBO will take 96,000 / 3 =
32,000 bytes, plus 3 × 1,000 × 4 = 12,000 bytes for the index buffer, totaling
44,000 bytes. In this ideal case, the indexed VBO takes only about 46% of the size
of the nonindexed VBO. Indexing thus reduces both the memory footprint and the
PCIe transfers.
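The arithmetic above can be checked with a few lines of C++. This is only a sketch of the calculation: the helper names `nonindexedBytes` and `indexedBytes` are ours, and we assume 4-byte floats and 32-bit indices, as in the example.

```cpp
#include <cstddef>
#include <cstdint>

static_assert(sizeof(float) == 4, "the example assumes 4-byte GL_FLOAT attributes");

// Bytes per vertex: position (3 floats) + UV (2 floats) + normal (3 floats) = 32.
constexpr std::size_t kBytesPerVertex = (3 + 2 + 3) * sizeof(float);

// Nonindexed: every triangle corner stores a full copy of its vertex.
constexpr std::size_t nonindexedBytes(std::size_t triangleCount) {
    return 3 * triangleCount * kBytesPerVertex;
}

// Indexed: each unique vertex is stored once, plus one 32-bit index per corner.
constexpr std::size_t indexedBytes(std::size_t uniqueVertexCount,
                                   std::size_t triangleCount) {
    return uniqueVertexCount * kBytesPerVertex
         + 3 * triangleCount * sizeof(std::uint32_t);
}
```

For the sphere example, with 1,000 unique vertices and 1,000 triangles, `nonindexedBytes(1000)` gives 96,000 and `indexedBytes(1000, 1000)` gives 44,000.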
The second performance penalty comes from the difference in cache usage. There
are two kinds of vertex caches:
AMD GPUs have a pretransform vertex cache that contains a part of the raw
VBO. This cache is used to feed the vertex shader.
The post-transform cache is used to store the output variables of the vertex
shader. This is useful because most of the time, a vertex is used by several
triangles. The cache avoids the cost of re-executing the same computations for
each vertex shared by several triangles. However, it uses the index of the vertex
as a key, so if primitives are drawn without indexing, the cache has no effect.
There are two consequences. First, indexing alone already improves performance.
Second, the use of both of these caches can be optimized:
If the element buffer contains indices to vertices that have a good spatial
locality, the pretransform cache will make a large number of hits. In other words,
indices 0-1-2 are better than 0-50-99.
If neighboring triangles are drawn consecutively, most of the used vertices
will be in the post-transform cache, available for immediate reuse. A number
of algorithms can be found in the literature for reorganizing the indices
in order to get a better post-transform cache usage. In particular, I recommend
nvTriStrip, which is slow but ready-to-use, and Tom Forsyth's algorithm
[Forsyth 06], which runs in linear time.
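To get a feel for the effect of triangle order on the post-transform cache, we can simulate it on the CPU. The sketch below assumes a FIFO cache keyed by vertex index. Real hardware may use a different size and replacement policy, and the function name `countVertexShaderRuns` is ours, so the numbers are only indicative.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Counts how many times the vertex shader would run for an index list,
// ASSUMING a FIFO post-transform cache of `cacheSize` entries keyed by index.
std::size_t countVertexShaderRuns(const std::vector<std::uint32_t>& indices,
                                  std::size_t cacheSize) {
    std::deque<std::uint32_t> cache;  // oldest entry at the front
    std::size_t misses = 0;
    for (std::uint32_t index : indices) {
        if (std::find(cache.begin(), cache.end(), index) != cache.end()) {
            continue;  // cache hit: the shader output is reused
        }
        ++misses;      // cache miss: the vertex shader runs
        cache.push_back(index);
        if (cache.size() > cacheSize) {
            cache.pop_front();  // evict the oldest entry
        }
    }
    return misses;
}
```

For the two triangles 0 1 2, 2 1 3, a cache that can hold all four vertices gives four shader runs instead of six. Passing a cache size of zero models the nonindexed case, where every corner is a miss. Comparing the count before and after reordering an index buffer gives a rough measure of how much the reordering helps under these assumptions.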
[Diagram: the same two triangles, with corners numbered 0, 1, 2, and 3.]
Vertex Array Buffer: (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), (x3, y3, z3)
Normal Array Buffer: (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), (x3, y3, z3)
Element Array Buffer: 0 1 2, 2 1 3
Figure 26.2. An indexed VBO.
Figure 26.2 shows what an indexed VBO looks like, along with the associated
attributes. Note that both coordinates and normals are shared for vertices 1 and 2.
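The relation between the two layouts can be made concrete in code. The following sketch expands an indexed attribute array into the nonindexed layout by copying one attribute per index. The `Vec3` type and the `expandIndexed` helper are illustrative names of ours, not part of OpenGL.

```cpp
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Expands an indexed attribute array into the nonindexed layout:
// one attribute entry per element of the index list.
std::vector<Vec3> expandIndexed(const std::vector<Vec3>& attributes,
                                const std::vector<std::uint32_t>& indices) {
    std::vector<Vec3> expanded;
    expanded.reserve(indices.size());
    for (std::uint32_t index : indices) {
        expanded.push_back(attributes.at(index));  // at() throws on a bad index
    }
    return expanded;
}
```

With four positions and the element array 0 1 2, 2 1 3, the result has six entries, and entries for vertices 1 and 2 appear twice. That duplication is exactly what the indexed form avoids.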
For these reasons, indexing is recommended by all major GPU vendors
[NVIDIA 08, Hart 04, Imagination Technologies 09]. However, Figure 26.3 shows
an excerpt of the COLLADA export of a similar mesh.
Figure 26.3. A COLLADA mesh.
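To illustrate the mismatch, here is one common way to merge per-attribute indices into a single index. It is a minimal sketch under our own assumptions, not necessarily the algorithm this chapter goes on to describe: each distinct (position index, normal index) pair becomes one output vertex, and a `std::map` remembers the pairs already seen. The names `UnifiedMesh` and `unifyIndices` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };

struct UnifiedMesh {
    std::vector<Vec3> positions;
    std::vector<Vec3> normals;
    std::vector<std::uint32_t> indices;  // one index shared by all attributes
};

// Merges separate position and normal indices (one of each per triangle
// corner) into a single index list usable as an element array buffer.
UnifiedMesh unifyIndices(const std::vector<Vec3>& positions,
                         const std::vector<Vec3>& normals,
                         const std::vector<std::uint32_t>& positionIndices,
                         const std::vector<std::uint32_t>& normalIndices) {
    UnifiedMesh mesh;
    std::map<std::pair<std::uint32_t, std::uint32_t>, std::uint32_t> seen;
    for (std::size_t corner = 0; corner < positionIndices.size(); ++corner) {
        const auto key = std::make_pair(positionIndices[corner],
                                        normalIndices.at(corner));
        auto it = seen.find(key);
        if (it == seen.end()) {
            // First time this combination appears: emit a new vertex.
            const auto newIndex = static_cast<std::uint32_t>(mesh.positions.size());
            mesh.positions.push_back(positions.at(key.first));
            mesh.normals.push_back(normals.at(key.second));
            it = seen.emplace(key, newIndex).first;
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

For a flat quad with four positions and a single shared normal, the output contains four vertices and the element array 0 1 2, 2 1 3. Keying on index pairs only merges corners that reference the same source entries. Merging attributes that are equal in value but stored at different indices would need a comparison on the values themselves.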
