George T. Heineman

Implementing Bloom Filters in Python

Date: This event took place live on December 01 2015

Presented by: George T. Heineman

Duration: Approximately 90 minutes.

Questions? Please send email to

Description:

A fundamental operation in Information Retrieval is to look up information in a repository by a key value. This operation may be costly since it might access a database or make a network request to fetch the data, which may or may not be present in the repository. One way to improve efficiency is to use a probabilistic data structure known as a Bloom Filter to test whether an element is a member of a set. A Bloom filter can accurately determine when an element does not belong to the set, in which case the costly access operation can be prevented. Even more impressive, it operates in constant time with fixed storage independent of the number of elements in the set. However, Bloom Filters may return a false positive even when the element does not exist; fortunately this false positive rate is predictable.

I will implement a Bloom Filter in Python and demonstrate its predictable behavior using several examples.

About George Heineman

George Heineman is an Associate Professor of Computer Science at WPI. His research interests are in Software Engineering. He co-edited the 2001 book "Component-Based Software Engineering: Putting the Pieces Together". He was the Program Chair for the 2005 International Symposium on Component-Based Software Engineering.


You might also be interested in

Designing Data Structures in Python
By George T. Heineman
September 2015
$119.99 USD
Working with Algorithms in Python
By George T. Heineman
July 2014
$169.99 USD
Algorithms in a Nutshell
By George T. Heineman, Gary Pollice, Stanley Selkow
July 2014
$50.99 USD