The k-means algorithm [Mac67], also called Lloyd’s algorithm [Llo82], is a way of finding clusters in a dataset. It is popular because it is simple to implement. Parallelizing it with the map (Chapter 4) and reduce (Section 5.1) patterns is straightforward. However, a little cleverness can reduce the number of synchronizations by manipulating the code so that the map and reduce patterns can be fused. The parallel k-means implementation involves a reduction over a “heavy” type. The Cilk Plus implementation illustrates the mechanics of defining reducer hyperobjects for such reductions, when the predefined reducers do not suffice. The TBB implementation shows how to use thread-local storage in TBB.