O'Reilly logo

Python in a Nutshell, 2nd Edition by Alex Martelli

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

URL Access

A URL identifies a resource on the Internet. A URL is a string composed of several optional parts, called components, known as scheme, location, path, query, and fragment. A URL with all its parts looks something like:

scheme://lo.ca.ti.on/pa/th?query#fragment

For example, in http://www.python.org:80/faq.cgi?src=fie, the scheme is http, the location is www.python.org:80, the path is /faq.cgi, the query is src=fie, and there is no fragment. Some of the punctuation characters form a part of one of the components they separate, while others are just separators and are part of no component. Omitting punctuation implies missing components. For example, in , the scheme is mailto, the path is , and there is no location, query, or fragment. The missing // means the URL has no location part, the missing ? means it has no query part, and the missing # means it has no fragment part.

The urlparse Module

The urlparse module supplies functions to analyze and synthesize URL strings. The most frequently used functions of module urlparse are urljoin, urlsplit, and urlunsplit.

urljoin

urljoin(base_url_string,relative_url_string)

Returns a URL string u, obtained by joining relative_url_string, which may be relative, with base_url_string. The joining procedure that urljoin performs to obtain its result u may be summarized as follows:

  • When either of the argument strings is empty, u is the other argument.

  • When relative_url_string explicitly specifies a scheme that is different ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required