The Uniform Hashing Model


Suppose the m keys k0, k1, · · · , km−1 are inserted into the hash table with a capacity for n keys. The descriptor uniform hashing (UH) is applied to a collision resolution procedure (CRP) that results in hash-table insertions in such a way that all c12ue001 subsets of occupied hash-table addresses are equally likely to occur. We evaluate the efficiency of SEARCH and INSERT assuming UH.

Although there is no direct realization of UH, Yao [Yao 1985] proved that double hashing, which is defined in Chapter 14, behaves asymptotically (as n, m → ∞) like uniform hashing in the sense that both have the same expected average SEARCH cost.

Here, we use Yao’s lovely idea—the basis of this asymptotic equivalence—to identify the hashing function value h(k) with a permutation π ≡ (π0, π1, · · · , πn−1) of the hash-table addresses 0, 1, · · · , n − 1. The insertion of the key k proceeds by testing sequentially the hash-table addresses π0, π1, · · · , πn−1 until either an unoccupied address or an address containing the key k is encountered. Uniform hashing means that all permutations π are equally likely to occur; that is, c12ue002.

That Yao’s UH-CRP results in the same occupancy as plain vanilla UH is because of the following Lemma:

Lemma 12.1. ...

Get Hashing in Computer Science: Fifty Years of Slicing and Dicing now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.