SpamAssassin by Alistair McDonald

Statistical Tests

Various statistical techniques can be used to identify spam. These generally involve a training phase, in which a collection of known spam and ham emails is passed through the filter so that it can learn the typical characteristics of each. Future emails can then be classified based on what was learned from past emails. The techniques differ in their choice of tokens and in the algorithms they use to predict whether an email is spam or ham. Tokens are normally words, but can also include email headers, HTML markup within emails, and other characters such as punctuation marks.
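
To make the training idea concrete, here is a minimal sketch of a token-based filter in Python. It is not SpamAssassin's algorithm; it is a toy naive Bayes classifier, chosen only because it is one of the simplest statistical techniques of this kind. The names `tokenize`, `ToyFilter`, `train`, and `spam_probability` are hypothetical, and the tokenizer (lowercase words plus `!` and `?`) is an assumption made for illustration.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: runs of letters, digits, $, ' and -, plus single ! or ?.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+|[!?]")


def tokenize(text):
    """Split text into lowercase word tokens plus '!' and '?' marks."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


class ToyFilter:
    """Hypothetical naive Bayes filter; illustrative only."""

    def __init__(self):
        # For each class, how many training messages contained each token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Training phase: record which tokens appear in a known message."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that a new message is spam."""
        # Start from the smoothed prior odds, then add one log-likelihood
        # ratio per distinct token (add-one smoothing avoids zero counts).
        log_odds = math.log(
            (self.messages["spam"] + 1) / (self.messages["ham"] + 1)
        )
        for token in set(tokenize(text)):
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


# Made-up training messages, purely for demonstration.
spam_filter = ToyFilter()
spam_filter.train("Cheap pills! Buy now!", "spam")
spam_filter.train("Win money now!!!", "spam")
spam_filter.train("Meeting agenda for Monday", "ham")
spam_filter.train("Can you review my patch?", "ham")

print(spam_filter.spam_probability("Buy cheap pills now"))          # above 0.5
print(spam_filter.spam_probability("Agenda for the patch review"))  # below 0.5
```

With only four training messages the estimates are crude, which illustrates why filters of this kind depend on a large and regularly updated training set.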

Statistical filters rely on regular training. They use the knowledge gained in training to estimate the probability that new emails ...
