Reduction
And minimizing divergence
Abstract
This chapter introduces the parallel reduction pattern, which plays an important role in many data-processing applications. When the reduction operator is associative and commutative, the computation can be parallelized as a reduction tree and then optimized aggressively. The chapter covers several optimization techniques needed to achieve high performance on large inputs: assigning thread indices to reduce control and memory divergence, using shared memory to reduce global memory accesses, thread coarsening, and segmented reduction.
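To make the reduction tree concrete, here is a minimal CPU-side sketch in C++ that emulates a single thread block computing a sum. It is an illustration under stated assumptions, not the chapter's kernel: the function name `tree_reduce_sum` and the parameter `block_dim` are hypothetical, `block_dim` is assumed to be a power of two, and the `partial` vector merely stands in for shared memory. Segmented reduction across multiple blocks is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the chapter): sums `data` with a reduction
// tree, emulating one block of `block_dim` threads sequentially on the CPU.
// Assumption: block_dim is a power of two and at least 1.
float tree_reduce_sum(const std::vector<float>& data, std::size_t block_dim) {
    // Stand-in for shared memory: one partial sum per emulated thread,
    // initialized to 0, the identity value for addition.
    std::vector<float> partial(block_dim, 0.0f);

    // Thread coarsening: each emulated thread first accumulates several
    // input elements serially. Thread t reads elements t, t + block_dim, ...,
    // so neighbouring threads touch neighbouring addresses (the access
    // pattern that coalesces on a GPU). This reordering of operands relies
    // on the operator being commutative and associative.
    for (std::size_t t = 0; t < block_dim; ++t) {
        for (std::size_t i = t; i < data.size(); i += block_dim) {
            partial[t] += data[i];
        }
    }

    // Reduction tree: the stride halves at every step, and the active
    // threads are always the contiguous range [0, stride). Keeping the
    // active threads contiguous is the index assignment that limits
    // control divergence on real hardware.
    for (std::size_t stride = block_dim / 2; stride > 0; stride /= 2) {
        for (std::size_t t = 0; t < stride; ++t) {
            partial[t] += partial[t + stride];
        }
        // On a GPU, a barrier would separate consecutive steps here.
    }

    return partial[0];
}
```

For example, calling `tree_reduce_sum` on the values 1 through 16 with `block_dim` set to 4 gives each emulated thread four elements to accumulate, followed by two tree steps. Because floating-point addition is only approximately associative, the result for general inputs can differ slightly in rounding from a left-to-right sequential sum.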
Keywords
Reduction trees; associative operators; commutative operators; identity value; control divergence; memory coalescing; memory divergence; ...