Chapter 10. The Quest for an Accelerated Population Count

Henry S. Warren, Jr.

A fundamental computer algorithm, and a deceptively simple one, is the population count or sideways sum, which calculates the number of bits in a computer word that are 1. The population count function has applications that range from the very simple to the quite sublime. For example, if sets are represented by bit strings, population count gives the size of the set. It can also be used to generate binomially distributed random integers. These and other applications are discussed at the end of this chapter.

Although uses of this operation are not terribly common, many computers—often the supercomputers of their day—had an instruction for it. These included the Ferranti Mark I (1951), the IBM Stretch computer (1960), the CDC 6600 (1964), the Russian-built BESM-6 (1967), the Cray 1 (1976), the Sun SPARCv9 (1994), and the IBM Power 5 (2004).

This chapter discusses how to compute the population count function on a machine that does not have that instruction, but which has the fundamental instructions generally found on a RISC or CISC computer: shift, add, and, load, conditional branch, and so forth. For illustration, we assume the computer has a 32-bit word size, but most of the techniques discussed here can be easily adapted to other word sizes.

Two problems in population counting are addressed: counting the 1-bits in a single computer word, and counting the 1-bits in a large number of words, perhaps arranged ...

Get Beautiful Code now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.