Slamming Spam: A Guide for System Administrators by Robert Haskins, Dale Nielsen

Word Choice

Authors of earlier Bayesian filters made differing choices about which words of a message to analyze. Some, as mentioned earlier, reduced words to their stems. Others processed only the text of the message.

Yerazunis’s CRM114 tool looks at every string of letters and digits in the message (i.e., all punctuation and white space are treated as “word” delimiters). A slight modification to his program decodes attachments. Because the whole message is tokenized this way, every piece of the header is examined, including timestamps, message identification numbers, and other administrivia, in addition to more intuitively pleasing items like the sender’s email address. The training methodologies (see the next section) work to balance items that are found in both ...
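
The delimiter rule described above can be illustrated with a minimal Python sketch. This is not CRM114's own code; the function name `tokenize` and the sample message are hypothetical, and the sketch assumes that a "word" is any unbroken run of ASCII letters or digits, with every other character acting as a delimiter.

```python
import re
from typing import List

# Assumption: a "word" is any run of ASCII letters or digits; all
# punctuation and white space act as delimiters. This illustrates the
# rule described in the text and is not CRM114's implementation.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(raw_message: str) -> List[str]:
    """Split a raw message (headers and body) into word tokens."""
    return TOKEN_RE.findall(raw_message)


# Hypothetical sample message, headers included.
sample = (
    "From: alice@example.com\n"
    "Date: Mon, 03 Jan 2005 10:15:00 -0500\n"
    "Message-ID: <12345@mail.example.com>\n"
    "\n"
    "Buy now!!!\n"
)

tokens = tokenize(sample)
print(tokens)
```

Under this assumption the header lines are split just like the body, so timestamp fragments such as `2005` and `0500` and the message identifier `12345` become tokens alongside the pieces of the sender's address.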
