Chapter 7. Introduction to Bayesian Filtering
One particularly powerful scheme for fending off spam takes a statistical approach. Instead of looking for individual elements of a message, this approach examines the message’s entire contents and tries to infer mathematically whether the message is spam or not.
In August 2002, the well-known anti-spam crusader Paul Graham wrote “A Plan for Spam”, which relates his experiences and frustrations in trying to write individual spam-detection rules, and his epiphany that a statistical approach might be better.
In hindsight, this approach makes good sense. For instance, Graham identified a word used only in his own field (“Lisp”) that, for him, was a powerful discriminator in detecting spam (and legitimate ...