O'Reilly logo

Machine Learning in Java by Boštjan Kaluža

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

E-mail spam detection

Spam or electronic spam refers to unsolicited messages, typically carrying advertising content, infected attachments, links to phishing or malware sites, and so on. While the most widely recognized form of spam is e-mail spam, spam abuses appear in other media as well: website comments, instant messaging, Internet forums, blogs, online ads, and so on.

In this chapter, we will discuss how to build naive Bayesian spam filtering, using bag-of-words representation to identify spam e-mails. The naive Bayes spam filtering is one of the basic techniques that was implemented in the first commercial spam filters; for instance, Mozilla Thunderbird mail client uses native implementation of such filtering. While the example in this chapter ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required