Connecting with HTTP
While urllib is suitable for working with Internet files, you may still need to perform more intricate communication with an HTTP server. For example, if you are writing a Python program to communicate between two web sites, you may need to adjust the headers to include any cookies the site may require. You may need to emulate a certain browser type (by placing its name in your User-Agent header) if the site requires the latest version of Internet Explorer. Working with httplib as opposed to urllib in cases such as these allows for finer control.
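As a rough illustration of that finer control, the sketch below sets a User-Agent and a Cookie header by hand. It assumes Python 3, where httplib was renamed http.client. The echo server, the helper name fetch_with_headers, and the cookie value session=abc123 are hypothetical and exist only so the example can run without touching a real site.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHeadersHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server: replies with the headers it received."""

    def do_GET(self):
        lines = [f"{name}: {value}\n" for name, value in self.headers.items()]
        body = "".join(lines).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch_with_headers(host, port, path, headers):
    """Send a GET request with caller-supplied headers; return (status, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()


def demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoHeadersHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_with_headers(
            "127.0.0.1",
            server.server_port,
            "/c7/favquote.cgi",
            {
                # Emulate a particular browser and supply a made-up cookie.
                "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)",
                "Cookie": "session=abc123",
            },
        )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status)
    print(body)
```

Against a real site you would call fetch_with_headers with that site's host name and port 80 instead of starting the local server.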
HTTP Conversations
HTTP conversations between browsers and servers involve headers and data. The interaction between a web browser and a web server reveals a great deal of information about both parties. The HTTP headers that accompany each request from the browser and precede the content in each response from the server carry a lot of metadata about both client and server. For example, when you type a URL into your browser and press Return, a complete HTTP request is sent to the remote server. It can look something like this:
GET /c7/favquote.cgi HTTP/1.1
Host: www.python.org
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)
Connection: Keep-Alive
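To make the structure of the request above concrete, here is a minimal sketch that splits it into its request line and header fields. The function name parse_request is hypothetical, and the parsing is deliberately simplified: it stores header names exactly as written (real HTTP header names are case-insensitive) and ignores any message body.

```python
RAW_REQUEST = (
    "GET /c7/favquote.cgi HTTP/1.1\r\n"
    "Host: www.python.org\r\n"
    "Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg\r\n"
    "Accept-Language: en-us\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP request into (method, path, version, headers)."""
    # A blank line separates the header section from any body.
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # The first line is the request line: method, path, protocol version.
    method, path, version = lines[0].split(" ", 2)

    # Each remaining line is "Name: value"; split on the first colon only.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, path, version, headers


if __name__ == "__main__":
    method, path, version, headers = parse_request(RAW_REQUEST)
    print(method, path, version)
    for name, value in headers.items():
        print(f"{name} = {value}")
```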
The headers tell the web server a great deal about the
capabilities of the client browser. From the first line of the headers
(GET /c7/favquote.cgi HTTP/1.1 ...