Chapter 9

Parallel patterns—parallel histogram computation

An introduction to atomic operations and privatization


This chapter introduces the parallel histogram computation pattern and the concept of atomic operations. It shows that atomic operations to the same location are serialized and their throughput is determined by their latency. It further introduces four important optimization techniques: interleaved data partitioning for improved memory coalescing, caching for reduced latency and improved throughput of atomic operations, privatization for reduced contention, and aggregation for reduced contention.
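The contrast between plain atomic updates and privatization can be sketched on the CPU. The block below is a minimal illustration, not the chapter's own code. It assumes C++ `std::thread` workers stand in for GPU threads, lowercase letters are counted into seven bins of four letters each, and each worker keeps its own private copy. The names `histogram_atomic`, `histogram_privatized`, `bin_of`, and `kNumBins` are hypothetical. Both functions use an interleaved partition, where worker `t` visits elements `t`, `t + num_threads`, and so on.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

constexpr std::size_t kNumBins = 7;  // assumption: 26 letters, 4 letters per bin
using Histogram = std::array<unsigned, kNumBins>;

// Map a lowercase letter to its bin; any other character maps to kNumBins (ignored).
inline std::size_t bin_of(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<std::size_t>(c - 'a') / 4 : kNumBins;
}

// Baseline: every worker performs an atomic read-modify-write on the shared bins.
// Updates that hit the same bin are serialized. Assumes num_threads >= 1.
Histogram histogram_atomic(const std::string& text, unsigned num_threads) {
    std::array<std::atomic<unsigned>, kNumBins> bins;
    for (auto& b : bins) b.store(0);

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Interleaved partition: worker t handles t, t + num_threads, ...
            for (std::size_t i = t; i < text.size(); i += num_threads) {
                const std::size_t b = bin_of(text[i]);
                if (b < kNumBins) bins[b].fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();

    Histogram out{};
    for (std::size_t b = 0; b < kNumBins; ++b) out[b] = bins[b].load();
    return out;
}

// Privatization: each worker accumulates into a private copy without contention,
// then merges into the shared bins with at most one atomic add per bin.
Histogram histogram_privatized(const std::string& text, unsigned num_threads) {
    std::array<std::atomic<unsigned>, kNumBins> bins;
    for (auto& b : bins) b.store(0);

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            Histogram local{};  // private copy, zero-initialized
            for (std::size_t i = t; i < text.size(); i += num_threads) {
                const std::size_t b = bin_of(text[i]);
                if (b < kNumBins) ++local[b];
            }
            for (std::size_t b = 0; b < kNumBins; ++b) {
                if (local[b] > 0) bins[b].fetch_add(local[b], std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();

    Histogram out{};
    for (std::size_t b = 0; b < kNumBins; ++b) out[b] = bins[b].load();
    return out;
}
```

Calling `histogram_atomic(text, 4)` and `histogram_privatized(text, 4)` on the same string returns identical counts. The difference is in how often the shared bins are touched: once per input character in the baseline, but at most `kNumBins` times per worker in the privatized version.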


Histogram; feature extraction; output interference; race condition; atomic operation; read-modify-write; memory bound; memory ...

From Programming Massively Parallel Processors, 3rd Edition.
