Chapter 10

Reduction

And minimizing divergence

Abstract

This chapter introduces the parallel reduction pattern, which plays an important role in many data-processing applications. When the reduction operator is associative and commutative, the computation can be parallelized as a reduction tree and then optimized aggressively. The optimization techniques covered are thread index assignment to reduce control and memory divergence, use of shared memory to reduce global memory accesses, thread coarsening, and segmented reduction. These techniques are needed to achieve high performance on large inputs.
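As a rough illustration of the reduction tree idea, the following is a minimal host-side C++ sketch, not the chapter's GPU kernel. The function name `tree_reduce_sum` is hypothetical, the inner loop merely stands in for the threads that would run concurrently on a GPU, and padding the input with the identity value is an assumption made here to keep the tree regular. At each level the active "threads" are the contiguous range 0 to stride - 1, which is the kind of index assignment that keeps active threads packed together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side simulation of a sum reduction tree (hypothetical helper).
// Outer loop: one iteration per tree level.
// Inner loop: one iteration per simulated "thread" at that level.
float tree_reduce_sum(std::vector<float> data) {
    if (data.empty()) {
        return 0.0f;  // identity value for addition
    }

    // Assumption: pad to the next power of two with the identity value
    // so that every level pairs elements evenly.
    std::size_t n = 1;
    while (n < data.size()) {
        n *= 2;
    }
    data.resize(n, 0.0f);
    assert((n & (n - 1)) == 0);  // n is a power of two

    // Active threads form the contiguous range [0, stride) at every level.
    // Pairing element t with element t + stride reorders the operands,
    // so this relies on the operator being commutative as well as associative.
    for (std::size_t stride = n / 2; stride >= 1; stride /= 2) {
        for (std::size_t t = 0; t < stride; ++t) {
            data[t] += data[t + stride];
        }
    }
    return data[0];
}
```

Because floating-point addition is only approximately associative, the tree result can differ from a sequential sum by rounding error for general inputs. An input of n elements is reduced in about log2(n) levels rather than n - 1 sequential steps.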

Keywords

Reduction trees; associative operators; commutative operators; identity value; control divergence; memory coalescing; memory divergence; ...
