Analyzing Your Own Mail Data

The Enron mail data makes for great illustrations in a chapter on mail analysis, but you’ll almost certainly want to take a closer look at your own mail data. Fortunately, many popular mail clients provide an “export to mbox” option, which makes it pretty simple to get your mail data into a format that lends itself to analysis by the techniques described in this chapter. For example, in Apple Mail, you can select some number of messages, pick “Save As…” from the File menu, and then choose “Raw Message Source” as the formatting option to export the messages as an mbox file (see Figure 3-7). A little bit of searching should turn up results for how to do this in most other major clients.

Most mail clients provide an option for exporting your mail data to an mbox archive

Figure 3-7. Most mail clients provide an option for exporting your mail data to an mbox archive

If you exclusively use an online mail client, you could opt to pull your data down into a mail client and export it, but you might prefer to fully automate the creation of an mbox file by pulling the data directly from the server. Just about any online mail service will support POP3 (Post Office Protocol version 3), most also support IMAP (Internet Message Access Protocol), and Python scripts for pulling down your mail aren’t very hard to whip up. One particularly robust command-line tool that you can use to pull mail data from just about anywhere is getmail , which turns out to ...

Get Mining the Social Web now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.