Referencing Documents: The URL

Every document on the Web has a unique address. (Imagine the chaos if they didn’t.) The document’s address is known as its uniform resource locator (URL).[37]

Several HTML/XHTML tags take a URL as an attribute value, including the tags for hyperlinks, inline images, and forms. All use the same URL syntax to specify the location of a web resource, regardless of the type or content of that resource. That’s why it’s known as a uniform resource locator.

Since they can be used to represent almost any resource on the Internet, URLs come in a variety of flavors. All URLs, however, have the same top-level syntax:

            scheme:scheme_specific_part

The scheme describes the kind of object the URL references; the scheme_specific_part is, well, the part that is peculiar to the specific scheme. The important thing to note is that the scheme is always separated from the scheme_specific_part by a colon, with no intervening spaces.
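The top-level syntax can be illustrated with a few lines of Python. This is only a minimal sketch: the helper name split_scheme is hypothetical, and mailto:someone@example.com is an illustrative address that does not appear in the text. The helper splits a URL at its first colon and does no further validation.

```python
def split_scheme(url):
    """Split a URL at its first colon into (scheme, scheme_specific_part).

    Hypothetical helper for illustration; it does not check that the
    scheme name itself is well formed.
    """
    scheme, colon, rest = url.partition(":")
    if not colon or not scheme:
        raise ValueError("no scheme found in %r" % url)
    return scheme, rest


print(split_scheme("http://www.blah-blah.com"))    # ('http', '//www.blah-blah.com')
print(split_scheme("mailto:someone@example.com"))  # ('mailto', 'someone@example.com')
```

Everything after the first colon is handed back untouched, because its meaning depends entirely on the scheme.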

Writing a URL

Write URLs using the displayable characters in the US-ASCII character set. For example, you have surely heard the now annoyingly common radio announcement of a business’s web site: “h, t, t, p, colon, slash, slash, w, w, w, dot, blah-blah, dot, com.” That’s a simple URL, written:

http://www.blah-blah.com

If you need to use a character in a URL that is not part of this character set, you must encode the character using a special notation. The encoding notation replaces the desired character with three characters: a percent sign and two ...
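As a rough illustration of such encoding, the sketch below uses the quote and unquote functions from Python's standard urllib.parse module. The filename "my file.html" is an assumed example, not one taken from the text; it contains a space, which cannot appear literally in a URL.

```python
from urllib.parse import quote, unquote

# Assumed example: a filename containing a space.
name = "my file.html"

encoded = quote(name)
print(encoded)           # my%20file.html
print(unquote(encoded))  # my file.html
```

The space comes back as %20, and unquote reverses the substitution to recover the original string.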
