Chapter 7. Introduction to Bayesian Filtering

One particularly powerful scheme for fending off spam takes a statistical approach. Instead of looking for individual elements of a message, this approach examines the message’s entire contents and tries to infer mathematically whether the message is spam or not.
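To make "infer mathematically" concrete, here is a minimal sketch of one common way to do it: a naive Bayes score built from word counts. It is an illustration under stated assumptions, not necessarily the method this chapter or Graham's essay goes on to describe. The tiny SPAM and HAM corpora, the whitespace tokenizer, the add-one smoothing, and every function name are hypothetical choices made for the example.

```python
import math
from collections import Counter

# Hypothetical, made-up training messages (illustration only).
SPAM = ["buy cheap pills now", "cheap mortgage offer now"]
HAM = ["meeting notes for project review", "question about the compiler release"]


def tokenize(text):
    """Split on whitespace and lowercase; a real filter would do more."""
    return text.lower().split()


def count_tokens(messages):
    """Count how often each token appears across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


SPAM_COUNTS = count_tokens(SPAM)
HAM_COUNTS = count_tokens(HAM)


def spam_probability(message):
    """Estimate P(spam | message) with naive Bayes and add-one smoothing."""
    vocab = set(SPAM_COUNTS) | set(HAM_COUNTS)
    spam_total = sum(SPAM_COUNTS.values())
    ham_total = sum(HAM_COUNTS.values())

    # Start from the class priors, taken from the corpus sizes.
    n_messages = len(SPAM) + len(HAM)
    log_spam = math.log(len(SPAM) / n_messages)
    log_ham = math.log(len(HAM) / n_messages)

    # Every known token in the message contributes evidence to both classes.
    for token in tokenize(message):
        if token not in vocab:
            continue  # tokens never seen in training carry no evidence here
        log_spam += math.log((SPAM_COUNTS[token] + 1) / (spam_total + len(vocab)))
        log_ham += math.log((HAM_COUNTS[token] + 1) / (ham_total + len(vocab)))

    # Turn the two log scores into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


if __name__ == "__main__":
    for text in ("cheap pills now", "compiler release question"):
        print(f"{text!r}: {spam_probability(text):.3f}")
```

The point of the sketch is that no single word decides the outcome: each token nudges the score toward spam or toward legitimate mail, and the final figure reflects the message as a whole.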

In August 2002, famous anti-spam crusader Paul Graham wrote “A Plan for Spam” [1], which relates his frustration with writing individual spam-detection rules and his epiphany that a statistical approach might work better.

In hindsight, this approach makes good sense. For instance, Graham identified a word used only in his own field (“Lisp”) that, for him, was a powerful discriminator in detecting spam (and legitimate ...
