Chapter 1. HTML, XHTML, and the World Wide Web

Though it began as a military experiment and spent its adolescence as a sandbox for academics and eccentrics, in less than a decade the worldwide network of computer networks, also known as the Internet, has matured into a highly diversified, financially important community of computer users and information vendors. From the boardroom to your living room, you can bump into Internet users of nearly every nationality and persuasion, from serious to frivolous individuals, from businesses to nonprofit organizations, and from born-again Christian evangelists to pornographers.

In many ways, the Web — the open community of hypertext-enabled document servers and readers on the Internet — is responsible for the meteoric rise in the network’s popularity. You, too, can become a valued member by contributing: writing HTML and XHTML documents and then making them available to web surfers worldwide.

Let’s climb up the Internet family tree to gain some deeper insight into its magnificence, not only as an exercise of curiosity, but to help us better understand just who and what it is we are dealing with when we go online.

The Internet

Although popular media accounts are often confused and confusing, the concept of the Internet really is rather simple: it’s a worldwide collection of computer networks — a network of networks — sharing digital information via a common set of networking and software protocols.

Networks are not new to computers. What makes the Internet unique is its worldwide collection of digital telecommunication links that share a common set of computer-network technologies, protocols, and applications. Whether you run Microsoft Windows XP, Linux, Mac OS X, or even the now ancient Windows 3.1, when connected to the Internet, computers all speak the same networking language and use functionally identical programs, so you can exchange information — even multimedia pictures and sound — with someone next door or across the planet.

The common and now quite familiar programs people use to communicate and distribute their work over the Internet have also found their way into private and semi-private networks. These so-called intranets and extranets use the same software, applications, and networking protocols as the Internet. But unlike the Internet, intranets are private networks, with access restricted to members of the institution. Likewise, extranets restrict access but use the Internet to provide services to members.

The Internet, on the other hand, seemingly has no restrictions. Anyone with a computer and the right networking software and connection can “get on the Net” and begin exchanging words, sounds, and pictures with others around the world, day or night: no membership required. And that’s precisely what is confusing about the Internet.

Like an oriental bazaar, the Internet is not well organized, there are few content guides, and it can take a lot of time and technical expertise to tap its full potential. That’s because . . .

In the Beginning

The Internet began in the late 1960s as an experiment in the design of robust computer networks. The goal was to construct a network of computers that could withstand the loss of several machines without compromising the ability of the remaining ones to communicate. Funding came from the U.S. Department of Defense, which had a vested interest in building information networks that could withstand nuclear attack.

The resulting network was a marvelous technical success, but it was limited in size and scope. For the most part, only defense contractors and academic institutions could gain access to what was then known as the ARPAnet (Advanced Research Projects Agency Network of the Department of Defense).

With the advent of high-speed modems for digital communication over common phone lines, some individuals and organizations not directly tied to the main digital pipelines began connecting and taking advantage of the network’s advanced and global communications. Nonetheless, it wasn’t until the last decade (around 1993, actually) that the Internet really took off.

Several crucial events led to the meteoric rise in popularity of the Internet. First, in the early 1990s, businesses and individuals eager to take advantage of the ease and power of global digital communications finally pressured the largest computer networks on the mostly U.S. government-funded Internet to open their systems for nearly unrestricted traffic. (Remember, the network wasn’t designed to route information based on content — meaning that commercial messages went through university computers that at the time forbade such activity.)

True to their academic traditions of free exchange and sharing, many of the original Internet members continued to make substantial portions of their electronic collections of documents and software available to the newcomers — free for the taking! Global communications, a wealth of free software and information: who could resist?

Well, frankly, the Internet was a tough row to hoe back then. Getting connected and using the various software tools, if they were even available for a given computer, presented an insurmountable technology barrier for most people. And most available information was plain-vanilla ASCII text about academic subjects, not the neatly packaged fare that attracts users to services such as America Online. The Internet was just too disorganized, and, outside of government and academia, few people had the knowledge or interest to learn how to use the arcane software, or the time to spend rummaging through documents looking for ones of interest.

HTML and the Web

It took another spark to light the Internet rocket. At about the same time the Internet opened up for business, some physicists at CERN, the European Particle Physics Laboratory, released an authoring language and distribution system they had developed for creating and sharing multimedia-enabled, integrated electronic documents over the Internet. And so were born Hypertext Markup Language (HTML), browser software, and the Web. No longer did authors have to distribute their work as fragmented collections of pictures, sounds, and text. HTML unified those elements. Moreover, the Web’s systems enabled hypertext linking, whereby documents automatically reference other documents located anywhere around the world: less rummaging, more productive time online.
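To make the idea of hypertext linking concrete, here is a minimal sketch in Python, using only the standard library's html.parser module. The sample page, its URLs (on the reserved example.com and example.org domains), and the names ReferenceCollector and collect_references are all hypothetical, invented for illustration. The sketch shows one small HTML document that combines text, an image, and links, and then lists the other documents that page refers to.

```python
from html.parser import HTMLParser

# Hypothetical page: the markup and URLs are illustrative assumptions,
# not taken from the text.
PAGE = """<html>
<head><title>Field Notes</title></head>
<body>
<p>Read the <a href="http://www.example.com/report.html">full report</a>.</p>
<img src="http://www.example.org/images/chart.gif" alt="Chart">
<p>See also the <a href="notes.html">local notes</a>.</p>
</body>
</html>"""


class ReferenceCollector(HTMLParser):
    """Collect the documents and media that a page refers to."""

    def __init__(self):
        super().__init__()
        self.links = []   # targets of <a href="...">
        self.images = []  # sources of <img src="...">

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])


def collect_references(markup):
    """Return (links, images) found in the given HTML markup."""
    collector = ReferenceCollector()
    collector.feed(markup)
    collector.close()
    return collector.links, collector.images


links, images = collect_references(PAGE)
print(links)
print(images)
```

One text file carries the prose, names an image held on a different server, and points to two other documents, one remote and one local. A browser follows the same references to fetch and display the related material.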

Lift-off happened when some bright students and faculty at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign wrote a web browser called Mosaic. Although designed primarily for viewing HTML documents, the software also had built-in tools to access the much more plentiful resources on the Internet, such as FTP archives of software and Gopher-organized collections of documents.

With versions based on easy-to-use graphical user interfaces familiar to most computer owners, Mosaic became an instant success. It, like most Internet software, was available on the Net for free. Millions of users snatched up copies and began surfing the Internet for “cool web pages.”

Golden Threads

There you have the history of the Internet and the Web in a nutshell: from rags to riches in just a few short years. The Internet has spawned an entirely new medium for worldwide information exchange and commerce. For instance, when the marketers caught on to the fact that they could cheaply produce and deliver eye-catching, wow-and-whizbang commercials and product catalogs to those millions of web surfers around the world, there was no stopping the stampede of blue suede shoes. Even the key developers of Mosaic and related web server technologies sensed potential riches. They left NCSA and made their fortunes with Netscape Communications by producing commercial web browsers and server software. That was until the sleeping giant Microsoft awoke. But that’s another story . . .

Business users and marketing opportunities have helped invigorate the Internet and fuel its phenomenal growth. Internet-based commerce has become Very Big Business and is expected to approach US$150 billion annually by 2005.

For some, particularly us Internet old-timers, business and marketing have also trashed the medium. In many ways, the Web has become a vast strip mall and an annoying advertising medium. Believe it or not, once upon a time, Internet users adhered to commonly held (but not formally codified) rules of netiquette that prohibited such things as “spamming” special-interest newsgroups with messages unrelated to the topic at hand or sending unsolicited email.

Nonetheless, the power of HTML and network distribution of information goes well beyond marketing and monetary rewards: serious informational pursuits also benefit. Publications, complete with images and other media like executable software, can get to their intended audiences in the blink of an eye, instead of the months traditionally required for printing and mail delivery. Education takes a great leap forward when students gain access to the great libraries of the world. And at times of leisure, the interactive capabilities of HTML links can reinvigorate our otherwise television-numbed minds.
