September 2018
Intermediate to advanced
426 pages
10h 46m
English
In this example, we can see how to extract email addresses from a web page using urllib2 and regular expressions.
You can find the following code in the get_emails_from_url.py file:
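    import re
    import urllib2

    # enter url (without the scheme, since 'http://' is prepended below)
    # example: https://www.packtpub.com/books/info/packt/terms-and-conditions
    web = raw_input("Enter url: ")

    # build the request for the url
    request = urllib2.Request('http://' + web)

    # get page content from the response
    content = urllib2.urlopen(request).read()

    # regular expression for email addresses; the dot before the
    # domain suffix is escaped so that it matches a literal "."
    pattern = re.compile(r"[-a-zA-Z0-9._]+@[-a-zA-Z0-9_]+\.[a-zA-Z0-9_.]+")

    # get mails from the content using the regular expression
    mails = re.findall(pattern, content)
    print(mails)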
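To see what the regular expression matches without making a network request, the pattern can be tried on a plain string. The following is a minimal sketch, not part of get_emails_from_url.py: the `extract_emails` helper, the `EMAIL_PATTERN` name, and the sample text are hypothetical and introduced only for illustration. It assumes the dot before the domain suffix is meant to be a literal dot, so it is escaped here.

```python
import re

# Same character classes as the script's pattern, with the dot before the
# domain suffix escaped so that it matches a literal "." only (assumption).
EMAIL_PATTERN = re.compile(r"[-a-zA-Z0-9._]+@[-a-zA-Z0-9_]+\.[a-zA-Z0-9_.]+")


def extract_emails(text):
    """Return every substring of text that matches EMAIL_PATTERN."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical sample content standing in for a downloaded page.
sample = "Contact sales@example.com or support@example.org for help."
print(extract_emails(sample))
```

Because the last character class in the pattern also accepts dots, an address followed directly by a full stop is returned with the trailing dot attached (for example, `a@b.com.`), so results may need some cleanup before use.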
In this screen capture, we can see the script in execution for the packtpub.com domain: