Algorithms and Data Structures for Massive Datasets
by Dzejla Medjedovic, Emin Tahirovic, Ines Schweigert
Part 1 Hash-based sketches
In the next few chapters, we will explore probabilistic succinct data structures. We will see how bread-and-butter problems from the world of classical algorithms, such as frequency estimation, membership queries, and the count-distinct problem, become harder to tackle as the amount of data grows and classical data structures start to spill out of RAM. We turn our attention to a collection of data structures that help solve the same problems using much less space. What’s the catch? These data structures will not always give you 100% accuracy. The good news is that the error rates are often low and are more than compensated for by major savings in storage. The data structures exhibited in part 1 include Bloom ...