Fetching Data via HTTP

The Python module httplib defines a class for fetching data via HTTP. As is typical with Python, only a few lines of code are needed to fetch a document via HTTP. Let’s experiment with it from an interactive Python session.

First, import the Python module and instantiate the HTTP class. The HTTP class requires the name of the server you wish to connect to. Let’s connect to the Python home page:

>>> import httplib
>>> http = httplib.HTTP('www.python.org')
>>>

Now tell the remote server which document to retrieve and which data formats you will accept. Ask for the main index page, and state that you accept either HTML or plain text:

>>> http.putrequest('GET', '/index.html')
>>> http.putheader('Accept', 'text/html')
>>> http.putheader('Accept', 'text/plain')
>>> http.endheaders()
>>>
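It can help to see roughly what these four calls put on the wire. The following is only a sketch: build_request is a hypothetical helper, not part of httplib, and the HTTP/1.0 version string in the request line is an assumption, since the text does not say which version the class sends.

```python
def build_request(method, path, headers):
    """Return request text for a method, a path, and (name, value) header pairs.

    Hypothetical helper for illustration only; it is not part of httplib.
    """
    # Request line, in the spirit of putrequest(). The HTTP/1.0 version
    # string is an assumption.
    lines = ["%s %s HTTP/1.0" % (method, path)]
    # One "Name: value" line per header, in the spirit of putheader().
    for name, value in headers:
        lines.append("%s: %s" % (name, value))
    # A blank line ends the header block, in the spirit of endheaders().
    return "\r\n".join(lines) + "\r\n\r\n"


request_text = build_request(
    "GET",
    "/index.html",
    [("Accept", "text/html"), ("Accept", "text/plain")],
)
print(request_text)
```

Each Accept header is sent as its own line, and the empty line at the end is what tells the server that the headers are complete.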

All that remains is to read the server’s reply. The getreply() method does this and returns three items: the status code (which the example calls the error code), the accompanying message, and the headers sent by the server. Make this call and print the result:

>>> errcode, errmsg, headers = http.getreply()
>>> print errcode, errmsg, headers
200 OK <mimetools.Message instance at 1073680>
>>>
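For readers on Python 3, where this functionality lives in the http.client module, the same sequence of calls can be sketched as follows. This is an approximation under stated assumptions, not the session above: DemoHandler, fetch_reply, and run_demo are hypothetical names, and a throwaway local server stands in for www.python.org so the example runs without a network connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_reply(host, port, path="/index.html"):
    """Return (status, reason, headers) for a GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.putrequest("GET", path)
        conn.putheader("Accept", "text/html")
        conn.putheader("Accept", "text/plain")
        conn.endheaders()
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status, response.reason, response.headers
    finally:
        conn.close()


def run_demo():
    """Start the stand-in server, fetch one reply from it, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_reply("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


status, reason, headers = run_demo()
print(status, reason, headers["Content-Type"])
```

Here getresponse() plays the role of getreply(): the status, reason, and headers attributes of the response correspond to the three items described above.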

HTTP defines the code 200 as success, which the message OK reflects. The headers object is an instance of another Python class, mimetools.Message, which can be used much like a Python dictionary, so let’s see what it contains:

>>> len(headers)
8

There are eight headers from the server. You can loop and print them all, ...
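A minimal sketch of dictionary-style access to headers follows. It makes two assumptions: the standard email package stands in for the mimetools.Message class shown above, and the header text is made up for illustration rather than taken from any real server.

```python
from email import message_from_string

# Made-up header text for illustration; a real server sends its own values.
raw_headers = (
    "Content-Type: text/html\n"
    "Content-Length: 1024\n"
    "Server: ExampleServer/1.0\n"
    "\n"
)
headers = message_from_string(raw_headers)

# len() counts the headers, just as it counts the keys of a dictionary.
print(len(headers))

# Loop over the header names and look each value up by name.
for name in headers.keys():
    print(name, "=", headers[name])

# Unlike a plain dictionary, lookups ignore the case of the header name.
print(headers["content-type"])
```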
