High Performance Visualization by E. Wes Bethel, Charles Hansen, Hank Childs


Chapter 5
Parallel Image Compositing Methods
Tom Peterka
Argonne National Laboratory
Kwan-Liu Ma
University of California at Davis
5.1 Introduction
5.2 Basic Concepts and Early Work in Compositing
    5.2.1 Definition of Image Composition
    5.2.2 Fundamental Image Composition Algorithms
    5.2.3 Image Compositing Hardware
5.3 Recent Advances
    5.3.1 2-3 Swap
    5.3.2 Radix-k
    5.3.3 Optimizations
5.4 Results
5.5 Discussion and Conclusion
    5.5.1 Conclusion
    5.5.2 Directions for Future Research
References
Image compositing is a fundamental part of high performance visualization on
large-scale parallel machines. Aside from reading a data set from storage,
compositing is the most expensive stage of the parallel rendering pipeline,
because it requires communication among a large number of processes. On a
modern supercomputer, compositing may generate hundreds of thousands of
messages, so developing compositing algorithms that scale with growing
machine size is crucial. Such algorithms have enabled, for example, wall-size
images tens of megapixels in resolution to be composited at interactive
frame rates from all of the nodes of some of the world's largest
supercomputers and visualization clusters. This chapter first reviews the
classic parallel image compositing algorithms, direct-send and binary-swap,
and then the optimizations proposed over the years, from scheduling to
compression and load balancing. Advanced compositing on modern
supercomputing architectures, however, is the main
focus of this chapter, and in particular, 2-3 swap and radix-k for petascale
HPC machines.
5.1 Introduction
The motivation for studying image compositing is the same as in most
of this book: data sets consisting of trillions of grid points and thousands
of timesteps are being produced by machines with hundreds of thousands of
cores. Machine architectures are growing in size and in complexity, with
multidimensional topologies and specialized hardware such as smart network
adaptors, direct-memory access, and graphics accelerators. Against this
backdrop, scientists in high energy physics, climate, and other domains are
demanding more of visualization: real-time, multivariate, time-varying
methods that maximize data locality and minimize data movement.
Image compositing is the final step in sort-last parallel rendering (see 4.2).
In sort-last parallel rendering, each processor generates a finished image of
its subset of the data, and these images must be combined into one final result.
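As a concrete illustration (a sketch, not code from this chapter), depth-ordered partial images can be combined per pixel with the standard over operator. The Python below assumes premultiplied-alpha RGBA pixels and a known front-to-back ordering of the partial images:

```python
def over(front, back):
    """Composite one premultiplied-alpha RGBA pixel over another.
    Both pixels are (r, g, b, a) tuples with components in [0, 1]."""
    fr, fg, fb, fa = front
    br, bg, bb, ba = back
    t = 1.0 - fa  # transmittance of the front pixel
    return (fr + t * br, fg + t * bg, fb + t * bb, fa + t * ba)

def composite_images(images):
    """Combine per-process partial images, listed front to back.
    Each image is a flat list of RGBA pixels; all images are the same size."""
    result = images[0]
    for img in images[1:]:
        result = [over(f, b) for f, b in zip(result, img)]
    return result
```

In practice, each process applies this operator only to the pixel ranges a compositing algorithm assigns to it, exchanging the remaining pixels over the network.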
Legacy image compositing algorithms were invented for much smaller systems.
The direct-send and binary-swap algorithms date from the mid-1990s.
Optimizations such as compression, identification of active pixels, and
scheduling further improved performance. In the early 2000s, production
compositing libraries implementing many of these features appeared, some of
which are still used today.
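As a rough illustration of the binary-swap pattern named above, the sketch below simulates only the communication schedule, assuming a power-of-two process count; a real implementation would also composite and exchange the actual pixel data (e.g., over MPI):

```python
import math

def binary_swap_schedule(nprocs, width):
    """Trace the binary-swap exchange pattern (nprocs a power of two).
    Returns, per round, a list of (partner, pixel range kept) per rank."""
    rounds = []
    # Each rank starts responsible for the full image [0, width).
    region = [(0, width)] * nprocs
    for r in range(int(math.log2(nprocs))):
        step = 1 << r
        this_round = []
        for rank in range(nprocs):
            partner = rank ^ step  # bitwise pairing used by binary swap
            lo, hi = region[rank]
            mid = (lo + hi) // 2
            # The lower rank of each pair keeps the first half of its current
            # region and the higher rank keeps the second; each sends the
            # other half to its partner for compositing.
            keep = (lo, mid) if rank < partner else (mid, hi)
            region[rank] = keep
            this_round.append((partner, keep))
        rounds.append(this_round)
    return rounds
```

In round r, rank i pairs with rank i XOR 2^r; after log2(p) rounds, each of the p processes owns a disjoint 1/p of the image, which is then gathered for display.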
Advances of the last five years feature greater awareness of the architecture
of HPC systems. The late 2000s yielded compositing algorithms with higher
degrees of concurrency, scalability, and flexibility, in the form of the
2-3 swap and radix-k algorithms. The early 2010s continued optimization at a very
large scale and also continued the implementation of the latest innovations
in production. This chapter highlights some results of these recent advances
with both theoretical and actual performance, and it concludes with future
directions in highly parallel image compositing.
5.2 Basic Concepts and Early Work in Compositing
The previous chapter classified parallel rendering according to when
rasterized images are sorted [17]: sort-first, sort-middle, and sort-last.
One way to understand the difference among these methods is to identify what
is distributed and what is replicated among the processes. The term process
designates a task executed in parallel with other processes, where each
process has a separate memory address space and communicates by passing
messages.
In sort-first rendering, the image pixels are usually distributed, and the
data set is typically replicated. HPC applications generate data sets many
times larger than the memory capacity of a single node, so sort-first
rendering is impractical at this scale.
