Chapter 7. Introduction to Bayesian Filtering

One particularly powerful scheme for fending off spam takes a statistical approach. Instead of testing a message for individual telltale elements, this approach examines the message’s entire contents and tries to infer mathematically whether the message is spam.
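To make the idea concrete, here is a minimal sketch of one way such an inference could work. It is an illustration under stated assumptions, not the method described in this chapter: it uses a simple naive Bayes combination with equal priors and add-one smoothing, and the function names (`tokenize`, `train`, `spam_probability`) and the four training messages are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a message into lowercase word-like tokens."""
    return re.findall(r"[a-z0-9'$-]+", text.lower())


def train(messages):
    """Count, for each token, how many messages contain it."""
    counts = Counter()
    for msg in messages:
        counts.update(set(tokenize(msg)))
    return counts


def spam_probability(message, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token evidence into one probability (assumes equal priors)."""
    log_odds = 0.0
    for tok in set(tokenize(message)):
        # Add-one smoothing keeps unseen tokens from producing zero probabilities.
        p_tok_spam = (spam_counts[tok] + 1) / (n_spam + 2)
        p_tok_ham = (ham_counts[tok] + 1) / (n_ham + 2)
        log_odds += math.log(p_tok_spam / p_tok_ham)
    return 1 / (1 + math.exp(-log_odds))


# Toy training data, invented purely for illustration.
SPAM = ["Buy cheap pills now", "Cheap loans, click now"]
HAM = ["Question about the compiler build",
       "Meeting notes on the compiler release"]

spam_counts = train(SPAM)
ham_counts = train(HAM)

print(spam_probability("cheap pills", spam_counts, ham_counts, len(SPAM), len(HAM)))
print(spam_probability("compiler question", spam_counts, ham_counts, len(SPAM), len(HAM)))
```

With this toy data the first message scores well above 0.5 and the second well below it. Note that every token contributes evidence in one direction or the other, so the verdict depends on the message as a whole rather than on any single rule.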

In August 2002, famous anti-spam crusader Paul Graham published “A Plan for Spam” [1], which recounts his frustrations with writing individual spam-detection rules and his epiphany that a statistical approach might work better.

In hindsight, this approach makes good sense. For instance, Graham identified a word used only in his own field (“Lisp”) that, for him, was a powerful discriminator in detecting spam (and legitimate ...
