The Uniform Hashing Model
12.1 AN IDEALIZED HASHING MODEL
Suppose the m keys k0, k1, · · · , km−1 are inserted into the hash table with a capacity for n keys. The descriptor uniform hashing (UH) is applied to a collision resolution procedure (CRP) that results in hash-table insertions in such a way that all subsets of occupied hash-table addresses are equally likely to occur. We evaluate the efficiency of SEARCH and INSERT assuming UH.
Although there is no direct realization of UH, Yao [Yao 1985] proved that double hashing, which is defined in Chapter 14, behaves asymptotically (as n, m → ∞) like uniform hashing in the sense that both have the same expected average SEARCH cost.
Here, we use Yao’s lovely idea—the basis of this asymptotic equivalence—to identify the hashing function value h(k) with a permutation π ≡ (π0, π1, · · · , πn−1) of the hash-table addresses 0, 1, · · · , n − 1. The insertion of the key k proceeds by testing sequentially the hash-table addresses π0, π1, · · · , πn−1 until either an unoccupied address or an address containing the key k is encountered. Uniform hashing means that all permutations π are equally likely to occur; that is, .
That Yao’s UH-CRP results in the same occupancy as plain vanilla UH is because of the following Lemma:
Lemma 12.1. ...