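import email
from BeautifulSoup import BeautifulSoup


class EmailObject:
    def __init__(self, filepath, category=None):
        # Despite its name, filepath must be an open file object:
        # email.message_from_file reads from a file handle, not a path.
        self.filepath = filepath
        self.category = category
        self.mail = email.message_from_file(self.filepath)

    def subject(self):
        return self.mail.get('Subject')

    def body(self):
        # decode=True undoes any Content-Transfer-Encoding on the payload.
        return self.mail.get_payload(decode=True)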
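To see what the two accessors return, here is a minimal sketch that exercises the same `email` calls directly. It assumes Python 3 and uses a made-up in-memory message (the `raw` string is hypothetical, standing in for a real fixture file):

```python
import email
import io

# Hypothetical message text standing in for an .eml fixture on disk.
raw = "Subject: Hello\nContent-Type: text/plain\n\nThis is the body.\n"

# message_from_file wants a file-like object, so wrap the string.
mail = email.message_from_file(io.StringIO(raw))

subject = mail.get('Subject')
body = mail.get_payload(decode=True)  # bytes on Python 3

print(subject)
print(body)
```

Note that on Python 3 the decoded payload comes back as bytes, so a caller that needs text has to decode it with an appropriate charset.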
BeautifulSoup is a library that parses HTML and XML.
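For the purposes here, parsing HTML mostly means reducing markup to the text a reader would see. The sketch below illustrates that idea using only the standard library's `html.parser` on Python 3. It is a stand-in for illustration, not BeautifulSoup's API, and the names `InnerTextParser` and `inner_text` are hypothetical:

```python
from html.parser import HTMLParser


class InnerTextParser(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not inside a skipped element.
        if not self._skip_depth:
            self.parts.append(data)


def inner_text(html):
    parser = InnerTextParser()
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts).strip()


html = "<html><body><p>Hello <b>world</b></p></body></html>"
text = inner_text(html)
print(text)
```

A dedicated library handles malformed markup and encodings far more robustly than this; the sketch only shows the shape of the operation.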
Now that we have handled the plaintext case, we need to handle HTML. For that, we want to capture only the inner_text. But first we need a test case, which looks something like this:
import unittest
import io
import re
from BeautifulSoup import BeautifulSoup ...