Before we present the actual dataset, here are a few real-world spam samples:
Here is an example of regular, wanted mail, also known as ham:
The following is a glimpse of the actual data used in our spam-ham classification task. It is split into two datasets, one per class:
- inbox.txt: A ham dataset compiled from a small collection of regular emails from my Inbox folder
- junk.txt: A spam dataset compiled from a small collection of junk email from my ...
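To make the two-file layout concrete, here is a minimal sketch of how the files could be read into one labeled list. The on-disk format is not specified here, so the sketch assumes one message per non-empty line; the helper names `load_messages` and `load_corpus` are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def load_messages(path, label):
    """Return (text, label) pairs for every non-empty line in the file.

    Assumption: each line holds one message. Adjust the splitting logic
    if the files use a different layout (for example, blank-line separators).
    """
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return [(line.strip(), label) for line in text.splitlines() if line.strip()]


def load_corpus(ham_path="inbox.txt", spam_path="junk.txt"):
    """Combine both files into a single list, ham first and spam second."""
    return load_messages(ham_path, "ham") + load_messages(spam_path, "spam")
```

Calling `load_corpus()` from the directory that holds both files would return a list such as `[("some message text", "ham"), ...]`, which can then be split into texts and labels for whatever classifier is used.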