4 Frequency estimation and count-min sketch

This chapter covers

  • Exploring practical use cases where frequency estimates arise and how count-min sketch can help
  • Learning how count-min sketch works
  • Exploring use cases of a sensor and an NLP app
  • Learning about the error versus space tradeoff in count-min sketch
  • Understanding dyadic ranges and how range queries can be solved with count-min sketch

Popularity analysis, such as producing a bestseller list on an e-commerce site, computing top-k trending queries on a search engine, or reporting frequent source-destination IP address pairs on a network, is a common problem in today’s data-intensive applications. Anomaly detection (i.e., monitoring changes in systems that are awake 24/7, such as sensor ...

Get Algorithms and Data Structures for Massive Datasets now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.