Chapter 12

Streaming Algorithms for Big Data Processing on Multicore Architecture

Marat Zhanikeev


This chapter brings together three topics: hash functions, Bloom filters, and the more recently emerged streaming algorithms. Hashing is the oldest of the three and is supported by an extensive literature. Bloom filters are built on hash functions and benefit directly from hashing efficiency. Streaming algorithms use both Bloom filters and hashing in various ways but impose strict performance requirements. This chapter examines the three topics in terms of efficiency and speed. The two main performance metrics are per-unit processing time and memory footprint. All algorithms are presented as C/C++ pseudocode. Specific attention ...
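To make the dependence of Bloom filters on hashing concrete, the following is a minimal C++ sketch and not the chapter's own code. The class name BloomFilter, its methods, the choice of std::hash combined with FNV-1a, and the double-hashing scheme for deriving several bit positions from two hashes are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Illustrative Bloom filter (hypothetical names, not from the chapter).
// Each key sets numHashes bit positions; a lookup reports "possibly
// present" only if every one of those positions is set.
class BloomFilter {
public:
    // numBits must be greater than zero.
    BloomFilter(std::size_t numBits, std::size_t numHashes)
        : bits_(numBits, false), numHashes_(numHashes) {}

    void add(const std::string& key) {
        for (std::size_t i = 0; i < numHashes_; ++i)
            bits_[index(key, i)] = true;
    }

    // May return true for a key that was never added (false positive),
    // but never returns false for a key that was added.
    bool possiblyContains(const std::string& key) const {
        for (std::size_t i = 0; i < numHashes_; ++i)
            if (!bits_[index(key, i)]) return false;
        return true;
    }

private:
    // Assumed scheme: derive the i-th position from two base hashes
    // (double hashing) instead of computing numHashes separate hashes.
    std::size_t index(const std::string& key, std::size_t i) const {
        std::uint64_t h1 = std::hash<std::string>{}(key);
        std::uint64_t h2 = fnv1a(key) | 1;  // force odd so the stride is nonzero
        return static_cast<std::size_t>((h1 + i * h2) % bits_.size());
    }

    // 64-bit FNV-1a, used here only as a second, independent-looking hash.
    static std::uint64_t fnv1a(const std::string& s) {
        std::uint64_t h = 14695981039346656037ULL;
        for (unsigned char c : s) {
            h ^= c;
            h *= 1099511628211ULL;
        }
        return h;
    }

    std::vector<bool> bits_;
    std::size_t numHashes_;
};
```

In this sketch the per-unit processing time is numHashes bit probes plus two hash computations per key, and the memory footprint is fixed at numBits bits regardless of how many keys are added, which are the two metrics the chapter names. A caller might construct BloomFilter bf(1024, 3), call bf.add("alpha"), and then test bf.possiblyContains("alpha").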
