URL Access
A URL identifies a resource on the Internet. A URL is a string composed of several optional parts, called components, known as scheme, location, path, query, and fragment. A URL with all its parts looks something like:
scheme://lo.ca.ti.on/pa/th?query#fragment
For example, in http://www.python.org:80/faq.cgi?src=fie, the scheme is http, the location is www.python.org:80, the path is /faq.cgi, the query is src=fie, and there is no fragment. Some of the punctuation characters form a part of one of the components they separate, while others are just separators and are part of no component. Omitting punctuation implies missing components. For example, in mailto:me@you.com, the scheme is mailto, the path is me@you.com, and there is no location, query, or fragment. The missing // means the URL has no location part, the missing ? means it has no query part, and the missing # means it has no fragment part.
The urlparse Module
The urlparse
module supplies functions to analyze and synthesize URL strings. The most frequently used functions of module urlparse
are urljoin
, urlsplit
, and urlunsplit
.
urljoin |
Returns a URL string
|
Get Python in a Nutshell, 2nd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.