Spam mail detection: various machine learning methods and their comparisons

Harendra Singh Negi
Aditya Bhatt
Vandana Rawat

Abstract

This chapter explores the application of machine learning techniques to spam mail detection. Five algorithms were implemented with both the CountVectorizer and TF-IDF vectorizer techniques: naive Bayes, decision tree, random forest (RF), support vector machine (SVM), and XGBoost. Each method was evaluated using accuracy, precision, recall, F1 score, AUC-ROC, and the precision-recall curve. With CountVectorizer, RF outperformed the other algorithms, reaching an accuracy of 96.67%. With the TF-IDF vectorizer, SVM and RF were found to be the top-performing algorithms, ...
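The difference between the two feature representations, and the meaning of the threshold-based metrics, can be illustrated with a minimal pure-Python sketch. It is not the chapter's code: the toy corpus, the whitespace tokenization, and the function names `count_features`, `tfidf_features`, and `binary_metrics` are assumptions made for illustration, and the TF-IDF variant shown (smoothed IDF with L2 normalization) is one common formulation rather than necessarily the one the authors used.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration (not the chapter's dataset).
docs = [
    "win free prize now",
    "free entry win cash",
    "meeting at noon tomorrow",
    "lunch tomorrow at noon",
]


def count_features(doc):
    """Raw term counts: the kind of feature a count vectorizer produces."""
    return Counter(doc.split())


def tfidf_features(doc, corpus):
    """TF-IDF weights with smoothed IDF and L2 normalization (assumed variant)."""
    n = len(corpus)
    weights = {}
    for term, tf in Counter(doc.split()).items():
        # Document frequency: number of documents containing the term.
        df = sum(1 for d in corpus if term in d.split())
        idf = math.log((1 + n) / (1 + df)) + 1
        weights[term] = tf * idf
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()}


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class (e.g. spam)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    print(count_features(docs[0]))
    print(tfidf_features(docs[0], docs))
    # Hypothetical labels and predictions: 1 = spam, 0 = not spam.
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0]))
```

In the first document every term has a count of 1, so the count representation treats them equally, while TF-IDF gives "prize" (which appears in one document) more weight than "free" (which appears in two). AUC-ROC and the precision-recall curve are not covered here because they require classifier scores rather than hard predictions.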
