Web servers host web resources. A web resource is the source of web content. The simplest kind of web resource is a static file on the web server’s filesystem. These files can contain anything: they might be text files, HTML files, Microsoft Word files, Adobe Acrobat files, JPEG image files, AVI movie files, or any other format you can think of.
However, resources don’t have to be static files. Resources can also be software programs that generate content on demand. These dynamic content resources can generate content based on your identity, on what information you’ve requested, or on the time of day. They can show you a live image from a camera, or let you trade stocks, search real estate databases, or buy gifts from online stores (see Figure 1-2).
In summary, a resource is any kind of content source. A file containing your company’s sales forecast spreadsheet is a resource. A web gateway to scan your local public library’s shelves is a resource. An Internet search engine is a resource.
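The difference between static and dynamic resources can be illustrated with a minimal sketch. It is not how any particular web server works; the paths, the `RESOURCES` table, and the `fetch` and `greeting` names are all hypothetical, chosen only to show that a resource can be either stored bytes or a program that produces bytes on demand.

```python
from datetime import datetime
from typing import Callable, Optional, Union

# A resource is either fixed content or a program that generates content.
Resource = Union[bytes, Callable[[datetime], bytes]]


def greeting(now: datetime) -> bytes:
    """Dynamic resource: the content depends on the time of day."""
    if now.hour < 12:
        part = "morning"
    elif now.hour < 18:
        part = "afternoon"
    else:
        part = "evening"
    return f"<p>Good {part}!</p>".encode("ascii")


# Hypothetical resource table: one static entry, one dynamic entry.
RESOURCES: dict[str, Resource] = {
    "/about.html": b"<p>About us</p>",
    "/greeting": greeting,
}


def fetch(path: str, now: Optional[datetime] = None) -> bytes:
    """Return the content for a path, running the program if it is dynamic."""
    resource = RESOURCES[path]
    if callable(resource):
        return resource(now or datetime.now())
    return resource
```

From the client's point of view both paths behave the same way: `fetch("/about.html")` and `fetch("/greeting")` each return content, and only the server knows which one was computed.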
Because the Internet hosts many thousands of different data types, HTTP carefully tags each object being transported through the Web with a data format label called a MIME type. MIME (Multipurpose Internet Mail Extensions) was originally designed to solve problems encountered in moving messages between different electronic mail systems. MIME worked so well for email that HTTP adopted it to describe and label its own multimedia content.
Web servers attach a MIME type to all HTTP object data (see Figure 1-3). When a web browser gets an object back from a server, it looks at the associated MIME type to see if it knows how to handle the object. Most browsers can handle hundreds of popular object types: displaying image files, parsing and formatting HTML files, playing audio files through the computer’s speakers, or launching external plug-in software to handle special formats.
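As a rough sketch of how a client might act on that label, the example below reads the MIME type from a `Content-Type` header value and looks up what to do with the object. The `HANDLERS` table and the action strings are illustrative assumptions, not a description of any real browser. Because HTTP borrowed MIME from email, Python's standard `email.message.Message` class can do the header parsing.

```python
from email.message import Message

# Hypothetical mapping from MIME type to what a client does with the object.
HANDLERS = {
    "text/html": "parse and format as a web page",
    "text/plain": "show as plain text",
    "image/gif": "display as an image",
    "image/jpeg": "display as an image",
}

FALLBACK = "hand off to a plug-in or offer to save"


def media_type(content_type_header: str) -> str:
    """Extract the lowercase type/subtype, ignoring parameters like charset."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def choose_action(content_type_header: str) -> str:
    """Pick an action for the object based on its MIME type."""
    return HANDLERS.get(media_type(content_type_header), FALLBACK)
```

For instance, `choose_action("text/html; charset=utf-8")` selects the web-page handler, while a type the table does not list falls through to the fallback action.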
A MIME type is a textual label, represented as a primary object type and a specific subtype, separated by a slash. For example:
An HTML-formatted text document would be labeled with type text/html.
A plain ASCII text document would be labeled with type text/plain.
A JPEG version of an image would be image/jpeg.
A GIF-format image would be image/gif.
An Apple QuickTime movie would be video/quicktime.
A Microsoft PowerPoint presentation would be application/vnd.ms-powerpoint.
There are hundreds of popular MIME types, and many more experimental or limited-use types. A very thorough MIME type list is provided in Appendix D.
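One common way a server chooses the label is by filename extension. The sketch below uses the built-in table in Python's standard `mimetypes` module to do that, then splits a label into its primary type and subtype. The filenames and the `label_for` and `split_label` helpers are made up for illustration; real servers typically use their own configurable mapping.

```python
import mimetypes

# A fresh table built from the module's built-in defaults only.
TABLE = mimetypes.MimeTypes()


def label_for(filename):
    """Guess a MIME type from the filename extension (None if unknown)."""
    mime_type, _encoding = TABLE.guess_type(filename)
    return mime_type


def split_label(mime_type):
    """Split 'type/subtype' into its primary type and specific subtype."""
    primary, _, subtype = mime_type.partition("/")
    return primary, subtype


if __name__ == "__main__":
    for name in ("index.html", "notes.txt", "photo.jpeg", "saw-blade.gif", "clip.mov"):
        print(name, "->", label_for(name))
```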
Each web server resource has a name, so clients can point out what resources they are interested in. The server resource name is called a uniform resource identifier, or URI. URIs are like the postal addresses of the Internet, uniquely identifying and locating information resources around the world.
Here’s a URI for an image resource on Joe’s Hardware store’s web server:
http://www.joes-hardware.com/specials/saw-blade.gif
Figure 1-4 shows how the URI specifies the HTTP protocol to access the saw-blade GIF resource on Joe’s store’s server. Given the URI, HTTP can retrieve the object. URIs come in two flavors, called URLs and URNs. Let’s take a peek at each of these types of resource identifiers now.
The uniform resource locator (URL) is the most common form of resource identifier. URLs describe the specific location of a resource on a particular server. They tell you exactly how to fetch a resource from a precise, fixed location. Figure 1-4 shows how a URL tells precisely where a resource is located and how to access it. Table 1-1 shows a few examples of URLs.
Table 1-1. Example URLs
The home URL for O’Reilly & Associates, Inc.
The URL for the Yahoo! web site’s logo
The URL for a program that checks if inventory item #12731 is in stock
The URL for the locking-pliers.gif image file, using password-protected FTP as the access protocol
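Most URLs follow a standardized format of three main parts:
The first part of the URL is called the scheme, and it describes the protocol used to access the resource (e.g., http://).
The second part gives the server Internet address (e.g., www.joes-hardware.com).
The rest names a resource on the web server (e.g., /specials/saw-blade.gif).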
Today, almost every URI is a URL.
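The three-part format described above can be seen by taking a URL apart with Python's standard `urllib.parse.urlsplit`. This is only a sketch: the sample URL is assembled from the server address and resource path used as examples in this section, and `three_parts` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def three_parts(url):
    """Return (scheme, server address, resource path) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


if __name__ == "__main__":
    scheme, server, resource = three_parts(
        "http://www.joes-hardware.com/specials/saw-blade.gif"
    )
    print("scheme:  ", scheme)
    print("server:  ", server)
    print("resource:", resource)
```

Real URLs can carry more than these three pieces (a port, a query string, a username and password), which `urlsplit` exposes through additional attributes.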
The second flavor of URI is the uniform resource name, or URN. A URN serves as a unique name for a particular piece of content, independent of where the resource currently resides. These location-independent URNs allow resources to move from place to place. URNs also allow resources to be accessed by multiple network access protocols while maintaining the same name.
For example, the following URN might be used to name the Internet standards document “RFC 2141” regardless of where it resides (it may even be copied in several places):
urn:ietf:rfc:2141
URNs are still experimental and not yet widely adopted. To work effectively, URNs need a supporting infrastructure to resolve resource locations; the lack of such an infrastructure has also slowed their adoption. But URNs do hold some exciting promise for the future. We’ll discuss URNs in a bit more detail in Chapter 2, but most of the remainder of this book focuses almost exclusively on URLs.
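To make the contrast with URLs concrete, here is a minimal sketch that splits a URN into its pieces. It assumes the basic `urn:<namespace>:<name>` layout defined in RFC 2141 and does not validate the allowed characters; the `parse_urn` helper is hypothetical, and `urn:ietf:rfc:2141` is used as a sample value in the IETF namespace. Note that nothing in the result says which server holds the document or which protocol to use.

```python
def parse_urn(urn):
    """Split a URN into (namespace identifier, namespace-specific string).

    Raises ValueError if the text does not have the urn:<nid>:<nss> shape.
    """
    # Split on the first two colons only; the name itself may contain colons.
    prefix, nid, nss = urn.split(":", 2)
    if prefix.lower() != "urn" or not nid or not nss:
        raise ValueError(f"not a URN: {urn!r}")
    return nid, nss


if __name__ == "__main__":
    namespace, name = parse_urn("urn:ietf:rfc:2141")
    print("namespace:", namespace)
    print("name:     ", name)
```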
Unless stated otherwise, we adopt the conventional terminology and use URI and URL interchangeably for the remainder of this book.