Chapter 10. Hashing

In this chapter, I define MyBetterMap, a better implementation of the Map interface than MyLinearMap, and introduce hashing, which makes MyBetterMap more efficient.

Hashing

To improve the performance of MyLinearMap, we’ll write a new class, called MyBetterMap, that contains a collection of MyLinearMap objects. It divides the keys among the embedded maps, so the number of entries in each map is smaller, which speeds up findEntry and the methods that depend on it.

Here’s the beginning of the class definition:

public class MyBetterMap<K, V> implements Map<K, V> {
    
    protected List<MyLinearMap<K, V>> maps;
    
    public MyBetterMap(int k) {
        makeMaps(k);
    }

    protected void makeMaps(int k) {
        maps = new ArrayList<MyLinearMap<K, V>>(k);
        for (int i=0; i<k; i++) {
            maps.add(new MyLinearMap<K, V>());
        }
    }
}

The instance variable, maps, is a collection of MyLinearMap objects. The constructor takes a parameter, k, that determines how many maps to use, at least initially. Then makeMaps creates the embedded maps and stores them in an ArrayList.

Now, the key to making this work is that we need some way to look at a key and decide which of the embedded maps it should go into. When we put a new key, we choose one of the maps; when we get the same key, we have to remember where we put it.

One possibility is to choose one of the sub-maps at random and keep track of where we put each key. But how should we keep track? It might seem like we could use a Map to look up the key and find the right sub-map, ...

Get Think Data Structures now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.