URL Syntax

URLs provide a means of locating any resource on the Internet, but these resources can be accessed by different schemes (e.g., HTTP, FTP, SMTP), and URL syntax varies from scheme to scheme.

Does this mean that each URL scheme has a radically different syntax? In practice, no. Most URLs adhere to a general URL syntax, and the style and syntax of different URL schemes overlap significantly.

Most URL schemes base their URL syntax on this nine-part general format:

<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<frag>
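As a rough illustration of the general format, the sketch below splits one URL into the nine named parts using `urlparse` from Python's standard `urllib.parse` module. The URL itself, including the username, password, and hostname, is made up for this example, and the dictionary keys simply mirror the placeholder names in the format line. Note that `urlparse` only recognizes `;params` on the last path segment, so this is a sketch under that assumption, not a full parser for every scheme.

```python
from urllib.parse import urlparse

# Hypothetical URL that happens to use all nine components.
url = "http://joe:secret@www.example.com:8080/docs/index.html;type=a?lang=en#top"

parts = urlparse(url)

# Map the parser's attributes onto the names used in the general format.
components = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,        # an int, or None when the URL has no port
    "path": parts.path,
    "params": parts.params,
    "query": parts.query,
    "frag": parts.fragment,
}

for name, value in components.items():
    print(f"{name:>8}: {value}")
```

Components that are absent from a URL come back as `None` (user, password, host, port) or as an empty string (path, params, query, frag), which matches the point that few URLs carry every part.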

Almost no URLs contain all these components. The three most important parts of a URL are the scheme, the host, and the path. Table 2-1 summarizes the various components.

Table 2-1. General URL components

| Component | Description | Default value |
|-----------|-------------|---------------|
| scheme | Which protocol to use when accessing a server to get a resource. | None |
| user | The username some schemes require to access a resource. | anonymous |
| password | The password that may be included after the username, separated by a colon (:). | <Email address> |
| host | The hostname or dotted IP address of the server hosting the resource. | None |
| port | The port number on which the server hosting the resource is listening. Many schemes have default port numbers (the default port number for HTTP is 80). | Scheme-specific |
| path | The local name for the resource on the server, separated from the previous URL components by a slash (/). The syntax of the path component is server- and ... | |
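The default values in Table 2-1 can be applied when a URL omits a component. The following is a minimal sketch, assuming Python's standard `urllib.parse.urlsplit`; the helper names `effective_port` and `effective_user` and the example hostnames are hypothetical. Only HTTP's default port of 80 comes from the table; the HTTPS and FTP entries are the well-known port numbers for those schemes, added here for illustration.

```python
from urllib.parse import urlsplit

# Default ports by scheme. Only HTTP (80) is stated in Table 2-1;
# 443 (HTTPS) and 21 (FTP) are the well-known ports for those schemes.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def effective_port(url):
    """Return the explicit port if present, else the scheme's default (or None)."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


def effective_user(url):
    """Return the username from the URL, falling back to 'anonymous'."""
    return urlsplit(url).username or "anonymous"


print(effective_port("http://www.example.com/index.html"))
print(effective_port("http://www.example.com:8080/index.html"))
print(effective_user("ftp://ftp.example.com/pub/readme.txt"))
```

The "anonymous" fallback only makes sense for schemes that expect a username, such as FTP; a real client would decide per scheme whether to apply it.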
