15
High-Performance Programming with Data-Oriented Design
Noel Llopis
Snappy Touch
Common programming wisdom used to encourage delaying optimizations until
later in the project, and then optimizing only those parts that were obvious bot-
tlenecks in the profiler. That approach worked well with glaring inefficiencies,
like particularly slow algorithms or code that is called many times per frame. In a
time when CPU clock cycles were a good indication of performance, that was a
good approach to follow. Things have changed a lot in today’s hardware, and we
have all experienced the situation where, after fixing the obvious culprits, no single function stands out in the profiler but performance remains subpar. Data-oriented design helps address this problem by architecting the game around memory access patterns and parallelization from the beginning.
15.1 Modern Hardware
Modern hardware can be characterized by having multiple execution cores and
deep memory hierarchies. The complex memory hierarchies exist because of the gap between CPU speed and memory access times. Gone are the days
when CPU instructions took about the same time as a main memory access. In-
stead, this gap continues to increase and shows no signs of stopping (see Fig-
ure 15.1).
Different parts of the memory hierarchy have different access times. The
smaller ones closer to the CPU are the fastest ones, whereas main memory can be
really large, but also very slow. Table 15.1 lists some common access times for
different levels of the hierarchy on modern platforms.
Figure 15.1. Relative CPU and memory performance over time.
With these kinds of access times, it’s very likely that the CPU is going to
stall waiting to read data from memory. All of a sudden, performance is not determined so much by how efficiently the program executes on the CPU, but by how efficiently it uses memory.
Barring a radical technology change, this is not a situation that’s about to
change anytime soon. We’ll continue getting more powerful, wider CPUs and
larger memories that are going to make memory access even more problematic in
the future.
Looking at code from a memory access point of view, the worst-case situa-
tion would be a program accessing heterogeneous trees of data scattered all over
memory, executing different code at each node. There we get not just constant data cache misses but also poor instruction cache utilization, because different functions are called at every node. Does that sound like a familiar situation? That's how
most modern games are architected: large trees of different kinds of objects with
polymorphic behavior.
What’s even worse is that bad memory access patterns will bring a program
down to its metaphorical knees, but that’s not a problem that’s likely to appear
anywhere in the profiler. Instead, it results in the common situation of everything being slower than we expected, without any single hot spot we can point to. That's because there isn't a single place that we can fix. Instead, we need to
change the whole architecture, preferably from the beginning, and use a data-
oriented approach.
[Figure 15.1 plot: relative performance on a log scale (1 to 10,000) against year (1980 to 2005), showing CPU performance doubling every 2 years versus DRAM performance doubling every 6 years, with the gap between the two curves widening.]