Chapter 1. Web Server Setup

Introduction

The process of designing a web site does not start in Photoshop, Dreamweaver, or your favorite text editor. Before the first line of code is written and the first image is optimized, a web site must have a home on the Internet and a virtual provenance of sorts that legitimizes its existence along with the millions of sites that have come before. Web sites must have a domain name, as well as disk space on a web server, to join the ever-growing club of online resources. In this chapter, we’ll untangle the choices that confront web site builders during the process of getting a new web site off the ground.

1.1. Registering a Domain Name

Problem

You need to register a new domain name for your web site.

Solution

Choose the right domain name and registrar for your web site after weighing factors such as budget and goals for organizational identity.

Discussion

Choosing a new domain name for a web site can often seem like shopping on the last day of an end-of-season sale at a popular clothing store. The best choices were long ago snapped up by the early shoppers. Like a picked-over pile of extra-small beige golf shirts, the remaining choices may not be a perfect fit for your planned web site.

Check the registrar of your choice to see if the domain name you want is already registered. Assuming for a minute that you will not be able to acquire your first choice (or second, or even third), here are some guidelines to consider when registering a brand new domain name:

  • Consider using your company or organization’s short branding message or marketing slogan as your domain name. For example, if Wal-Mart’s deep pockets and legions of lawyers were not able to wrest ownership of walmart.com from a hypothetical cybersquatter, they might consider alwayslowprices.com an acceptable alternative.

  • Try to come up with an action-oriented phrase or common aphorism that dovetails with the mission of your web site, and build your web site around that domain. For example, a site that promotes good health through good diet might register anappleaday.com.

  • Try adding your city, state, or other local identifier to your already-taken first or second choice to find an acceptable alternative that you can claim for your own, such as austinwebdesign.com or youngstownyoga.com.

  • Avoid using hyphens and long acronyms in your domain name. You might be tempted to register an alternative domain name by tweaking your already-taken first choice with hyphens between key words, or by reducing your business name to an alphabet-soup acronym of unrelated letters. Don’t do it. A fair share of your potential site visitors will trip over these grammatical stumbling blocks, leaving your web site lost in cyberspace. For example, wsj.com works; the-wall-street-journal.com does not.

  • Consider registering a domain name from the ever-growing list of new top-level domains (TLDs) (see Table 1-1).

Table 1-1. Top-level domains: past, present, and future

Name       Description

Original TLDs
.com       May be registered by anyone; operated by VeriSign Global Registry. Available since 1995.
.net       May be registered by anyone; operated by VeriSign Global Registry. Available since 1995.
.org       May be registered by anyone; operated by Public Interest Registry. Available since 1995.
.edu       Reserved for U.S. educational institutions, such as universities or high schools. Operated by EDUCAUSE. Available since 1995.
.gov       Reserved for U.S. government use since 1995.
.mil       Reserved for U.S. military use since 1995.

Other sponsored top-level domains (sTLDs)
.aero      Sponsored by the Societe Internationale de Telecommunications Aeronautiques SC (SITA) and restricted to organizations within the air transport industry. Available since 2001.
.biz       Operated by NeuLevel, a joint venture between NeuStar, a Virginia-based telecommunications company, and Melbourne IT, an Australian domain name registration service. Must be used by businesses for commercial purposes. Available since 2001.
.coop      Sponsored by Dot Cooperation LLC and the National Cooperative Business Association, based in Washington, D.C. It is restricted to cooperative organizations. Available since 2001.
.info      Operated by Afilias Limited, a consortium of 19 major domain name registrars including VeriSign, Register.com, and Tucows. Anyone may register a .info domain name. Available since 2001.
.int       Registrants must be an intergovernmental organization. Operated by the Internet Assigned Numbers Authority (IANA). Available since 1998.
.museum    Sponsored and administered by the Museum Domain Management Association, a nonprofit organization founded by the International Council of Museums and the J. Paul Getty Trust. Restricted to accredited museums worldwide. Available since 2001.
.name      Offered to individuals for personal web sites and email addresses; operated by Global Name Registry. Available since 2001.
.pro       Marketed to professionals, such as accountants, doctors, lawyers, and engineers. Operated exclusively by RegistryPro. Available since 2002.

Proposed, pending, and recently added TLDs
.asia      Proposed by DotAsia Organisation Ltd. in early 2004.
.cat       Approved by ICANN in September 2005, but it is not for feline aficionados. From the applicant’s web site: “Why do we want .cat? Because the Catalan language and culture are a community that wants to be identified with its own domain on the internet.” Who’s next? Klingons?
.jobs      Sponsored by the Society for Human Resource Management (SHRM); approved by ICANN in late 2004.
.mail      Proposed by the Spamhaus Project and others as an antidote to spam.
.mobi      Sponsored by Mobi JV, a consortium of Microsoft, Nokia, Vodafone, and other heavyweight multinational corporations. Geared toward web sites to be viewed on mobile devices, such as PDAs and cell phones. Approved by ICANN in July 2005.
.post      Sponsored by the Switzerland-based Universal Postal Union; approved by ICANN in late 2004.
.tel       In July 2005, ICANN approved Telnic’s application to run a TLD for managing corporate and individual contact information.
.travel    Sponsored by the Travel Partnership Corporation; approved by ICANN in late 2004.
.xxx       In August 2005, the Bush administration expressed its opposition to the creation of a new TLD specifically for the porn industry.

Notable country code TLDs (ccTLDs)
.md        Originally for use by the eastern European Republic of Moldova; now marketed to physicians.
.tv        Administered by the .tv Corporation, a subsidiary of VeriSign. This TLD hit the free market in 2000 thanks to the South Pacific island nation of Tuvalu.
.us        Commonly used for city, county, and state web sites in the United States; now also sold for commercial use to web sites with domains in every other TLD.
.eu        Proposed to ICANN by EURid for use by businesses and individuals in the European Union.

With your domain name chosen, it’s time to claim it as your own by registering it. To make sense of the often complex and overlapping roles of domain name registrars and web site hosting providers, consider this great analogy (related on numerous sites on the Web) that recasts the process as one that should be familiar to car owners everywhere.

Imagine for a minute that your web site is an automobile—say, a red Lexus RX 330—and its domain name is a personalized license plate. To get a license plate for your Lexus, you register it with your local department of motor vehicles; to get a domain name for your web site, you must register it with an accredited domain name registrar. The Internet Corporation for Assigned Names and Numbers (ICANN) keeps a list of accredited registrars on its web site (http://www.icann.org/registrars/accredited-list.html). If the registrar you want to use is not on the list, they are likely a reseller of an approved registrar’s services. Many web hosting companies (more on them later) offer domain name registration in this way.

Most Lexus owners, it’s safe to assume, keep their cars in a garage, be it semi-detached in a suburban subdivision or underground in spot C219. After you have registered your domain name, you have to find a place to keep it. Web hosting companies typically provide domain name service (DNS) and disk space on a web server where web site designers “park” their web sites.

But where the rubber meets the road, so to speak, is where you as the web designer draw the line between registration and web hosting. Many—if not most—registrars can host your site, and a growing number of hosting companies can register your domain name when you sign up for one of their hosting plans. But few Lexus owners would grant the clerks at the DMV the dual role of parking attendant for one of their most valuable assets (their car). For your domain name and web site, you would be wise to abide by the same separation-of-powers principle, choosing one company to handle your domain registration and a different one to do the hosting.

Why? Here are some gotchas to beware of and avoid when choosing a registrar and host for your domain name and web site:

  • Hosting companies that offer to register your domain name have been known to list themselves as the owner of your domain name. Although this practice is less common than it once was, and by no means widespread, clearing up this administrative wrinkle in your whois record can be a real headache if or when the time comes to relocate your web site to a new hosting provider. Bottom line: if you choose to register and host with the same company, read the fine print in your service agreement.

  • Registrars that offer no-cost, or low-cost, registration in exchange for you also choosing them as your hosting provider may be doing so in hopes of collecting high fees when your domain comes up for renewal. Expect to pay $10 to $35 for a one-year registration. Bottom line: assume that you’re getting what you pay for and make sure you know who’s responsible for what.

See Also

ICANN-accredited registrars are listed at http://www.icann.org/registrars/accredited-list.html, and whois.net is another good domain lookup service, at http://www.whois.net/.

Netcraft (http://netcraft.com) tracks Internet technologies and compiles statistics, including hosting provider network performance and uptime. TopHosts.com (http://www.tophosts.com) and HostSearch (http://hostsearch.com) let you search for hosting plans based on price, platform, and features.

1.2. Managing and Protecting a Domain Name

Problem

You need to protect the investment you made in a domain name for your web site.

Solution

Learn how the domain registration system works and keep your domain from being neglected or stolen by:

  • Knowing the expiration date

  • Keeping contact information up to date

  • Enabling domain security features

  • Choosing a strong domain management password

  • Registering your domain name as a trademark

  • Reading every email your registrar sends to you carefully

  • Consolidating multiple domains

  • Registering domain name variants

  • Using a domain-name monitoring service

  • Planning ahead if you move your domain to another host

  • Setting up a third-party, backup DNS service

Discussion

When comparing a web site to a car, as in Recipe 1.1, you should take into account one key distinction: cars depreciate in value the more you use them. But once you build a functioning web site at your domain—with growing traffic and name recognition—that domain becomes many times more valuable to you than the nominal fee you paid to register it. For that reason, you should treat your domain name as a valuable asset to your business or organization.

The process of choosing a registrar, name service, and hosting provider can be complicated considering all the overlapping options that a web site builder must sort out before making a decision. But the process of losing a domain name can be deceptively simple—with the emphasis on deceptive—and can happen right under the nose of the careless domain name owner.

Ownership of a domain name can be lost to the fraudulent actions of an aggressive domain name speculator, or simply for want of attention to detail. In most cases, a registered domain that is allowed to expire will be snapped up by a speculator within hours of it becoming available.

Learn from the mistakes of others, including me. A few years ago I was managing a Spanish-language web site for a client with a domain name listed with a registrar based in Spain. I thought I had all my bases covered early in 2002, well in advance of the expiration date listed in the whois database: 12-01-2002. But by mid-January, the domain was no longer in my control. The expiration date I had assumed to be December 1, 2002, was actually formatted in the European date style of day first, then month, and year. After my registration expired on January 12, the domain was purchased by a speculator who wanted $5,000 to sell it back to my client, a price the client could not afford.

Here are some important techniques for domain management that can prevent the inadvertent loss of your domain name:

Keep track of the expiration date.

Put the date on your calendar and keep a printout of your whois record (listing the administrative, billing, and technical contacts and the expiration date) in your files. Use the buddy system: make sure at least one other person with an interest in protecting the domain knows the expiration date and whois record information. Choose a registration term that won’t exceed the institutional memory of the domain name owners. Although many registrars offer registration periods of up to 10 years, I prefer to keep mine to two or three years. That way, I get a chance every so often to review the value of the domain and even choose a new registrar if I want. (I was never very good at that “Where will you be in five years?” interview question either.)
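The date mix-up described earlier is easy to reproduce. Here is a minimal Python sketch, assuming the whois record shows a purely numeric date with hyphens; the function names are made up for illustration. It lists every calendar date a string such as 12-01-2002 could mean and plans around the earliest one.

```python
from datetime import date, datetime


def possible_expirations(raw: str) -> set[date]:
    """Return every date `raw` could mean when read month-first or day-first."""
    candidates = set()
    for fmt in ("%m-%d-%Y", "%d-%m-%Y"):  # U.S. style, then European style
        try:
            candidates.add(datetime.strptime(raw, fmt).date())
        except ValueError:
            pass  # not a valid date in this format
    return candidates


def earliest_expiration(raw: str) -> date:
    """Plan around the earliest plausible reading of an ambiguous date."""
    candidates = possible_expirations(raw)
    if not candidates:
        raise ValueError(f"unrecognized date: {raw!r}")
    return min(candidates)
```

If a date has two possible readings, treat the earlier one as the deadline until your registrar confirms otherwise.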

Make sure the contact information is up to date.

The whois listing should have your correct contact data and list the proper owner: either the administrative contact, billing contact, or both. Make sure that you give your registrar, as well as your hosting company, an email address that’s not @yourdomainname.com in case there are problems with your domain name or account that make your email inoperable.

Enable any and all security features that your registrar offers.

A new ICANN domain transfer policy went into effect in the second half of 2004 that cut the time between a transfer request and its taking place to as little as five days. Basically, it allows anyone to request a domain transfer and the registrar to authorize the transfer if the current owner does not object. When this policy went into effect, one pithy online forum poster quipped that it spelled the end of week-long, internet-free vacations for web builders everywhere. You can prevent this form of hijacking by enabling a registration lock on your domain. Only unlocked domains can fall victim to the new quick transfer procedure, and only the owner of the domain can unlock it. Other security features vary among registrars, so familiarize yourself with those that are available on your domain and use them to protect it.

Choose a complex, hard-to-guess domain management password.

Most good registrars offer web-based tools for managing your domain, so you’ll need to create a secure password for accessing your account. Choose one that contains at least eight characters, including both upper- and lowercase letters and at least one numeral. Don’t base it on a real word or on any bit of personal information that could be guessed by other means, such as your birthday, address, or phone number. If you need to give the password to a colleague or web designer, do not send it by email, which can be intercepted and read en route to its intended recipient. When sharing sensitive passwords, deliver them in person, over the phone, or by fax, provided the receiving fax machine is in a trusted location. If you forget your password and your registrar emails it to you, log in and change the password immediately. Many registrars use a better method: resetting the forgotten password and requiring the domain owner to verify the change by logging in to the account. In either case, don’t forget to make a note of your new password.
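If you would rather not invent a password yourself, a short script can do it. The sketch below is one possible approach, not a registrar requirement: it uses Python’s standard secrets module, and the function name and the default length of 16 are my own assumptions. It keeps drawing random characters until the result satisfies the guidelines above (mixed case and at least one numeral).

```python
import secrets
import string


def generate_password(length: int = 16) -> str:
    """Return a random password with lowercase, uppercase, and digit characters."""
    if length < 8:
        raise ValueError("use at least eight characters")
    alphabet = string.ascii_letters + string.digits
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until all three character classes are present.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate
```

Because the characters are chosen at random, the result is not based on a real word or on personal information.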

Register your domain name as a trademark, if possible.

If your web site address identifies a distinctive product or service that your business provides, then you might consider applying for trademark protection through the U.S. Patent and Trademark Office, and other national trademark offices as necessary. Trademark protection can be a potential weapon on your side, should a dispute over your domain name arise. Bear in mind, however, that the most insidious domain hijackers may re-register your domain in a country that does not have the same high regard for U.S. trademarks as you do. In that case, prepare for a long, costly—and usually fruitless—effort to reclaim your stolen domain. For more resources on handling domain disputes, refer to the organizations listed under “See Also” at the end of this section.

Read every email your registrar sends you carefully and skeptically.

ICANN now requires registrars to contact domain name owners annually to verify contact information. Unscrupulous domain name speculators also will try to contact you with an email that appears to come from your registrar, in an attempt to trick you into providing information to them that they can use to hijack your domain. If you’re unsure about any communication you get regarding your domain, call your registrar or report the fraudulent email to them immediately.

Consolidate multiple domains with one registrar and a common expiration date.

It’s tempting to shop around for the best deal when registering a new domain name, but before you know it you’ve got nearly as many domain management accounts with various registrars as you have domains under your control. The potential to lose one or more of your domains is an order of magnitude greater in this situation. Find the registrar with the balance of features, prices, and management tools that best meets your needs, and stick with that registrar. Move domains at other registrars over to your preferred registrar when they’re up for renewal. The little bit of extra money you will spend will be worth it for peace of mind.

Register as many variants of your domain as your budget will allow.

Often, the appearance of a hijacked domain can be more likely and more damaging than the hijacking itself. Take the case of a hypothetical nonprofit that registers only the dot-org (.org) variation of its name. A group with opposing views—or just a penchant for mischief—can set up a web site using the dot-com (.com) domain, leaving visitors who don’t know any better confused about which site truly speaks for the organization.
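To draw up a shopping list of variants, you could start from something like this small Python sketch. The function name and the default list of TLDs are assumptions chosen for illustration; substitute whichever TLDs matter to your audience and budget.

```python
def domain_variants(name: str, tlds=("com", "net", "org")) -> list[str]:
    """List the same second-level name under other TLDs."""
    label, _, registered_tld = name.lower().partition(".")
    # Skip the TLD that is already registered.
    return [f"{label}.{tld}" for tld in tlds if tld != registered_tld]
```

For the hypothetical nonprofit above, calling the function with its .org name returns the .com and .net variants to check with a registrar.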

Use a domain name monitoring service such as SnapNames or NameProtect.

SnapNames and NameProtect allow you to get alerts about potentially unwanted changes to domains you own and registration opportunities for domains you don’t own, but want to own.

Plan ahead if you move your domain to another host.

One of the services your web-hosting company will provide for you is DNS on its domain name servers. DNS functions as the address book of the Internet, matching up the Internet Protocol (IP) numbers by which network traffic gets routed with alphabetical domain names that are easier for humans to remember. Just as previous residents at your home address may periodically get letters in your mailbox, DNS does not get updated instantaneously when you move your site and domain to a new web hosting service. That’s because when you move your site, the domain name remains the same but the IP number associated with it changes, and propagation of the new IP number to the thousands of DNS servers around the world takes anywhere from 24 to 72 hours.
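One rough way to see whether the change has reached you is to ask your own computer’s resolver which IP number it currently returns for the domain. The Python sketch below assumes you know the new server’s IP number; the function name is hypothetical, and the resolve parameter exists only so the lookup can be swapped out for testing.

```python
import socket
from typing import Callable


def has_propagated(domain: str, new_ip: str,
                   resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if the local resolver already maps `domain` to `new_ip`."""
    try:
        return resolve(domain) == new_ip
    except OSError:
        # Lookup failures (socket.gaierror is an OSError) count as not yet.
        return False
```

A True result tells you only what the DNS servers your machine consults have learned; visitors elsewhere may still be sent to the old server.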

When moving your site, follow these steps in order or risk your site disappearing from the web temporarily while information about your move spreads throughout the DNS system:

  1. Set up your new hosting account.

  2. Copy all your web site files to the new account when you have confirmation that the account is set up (usually within 24 hours).

  3. In the home page file saved on the new web server, place a hidden comment tag that distinguishes it from the copy on the old web server, like this:

    	<!-- new host -->

  4. Preview how the site will look by connecting to it with your web browser using the IP number of the new web server or a preview URL provided by the hosting company (e.g., http://yourdomain.newhost.com).

  5. When you’re satisfied that the site on the new host’s servers looks and behaves like the site on the current host’s servers, notify your domain registrar that you want to change the DNS server information for your domain to the servers maintained by your new hosting company. (The hosting company should give you the IP numbers and/or hostnames of its DNS servers when you sign up.)

Usually you can update your DNS information via your registrar’s web-based control panel for your account. At this point, the waiting period for the DNS change begins, so any changes you make to your site during this period must be made to files on both the old and new hosting account. I prefer to pull the trigger on DNS changes on a Friday, let the propagation occur over the weekend, and then check the site on Monday. Viewing source and finding the hidden tag confirms that the propagation is almost, if not entirely, done. By the middle of the week, you can cancel your old hosting account.
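Checking for the hidden tag can be scripted instead of done by hand with View Source. This is a minimal Python sketch, assuming the marker text is exactly the comment shown in step 3; both function names are made up for illustration.

```python
from urllib.request import urlopen

MARKER = "<!-- new host -->"  # must match the comment placed in step 3


def served_from_new_host(html: str, marker: str = MARKER) -> bool:
    """Return True if the page source contains the new-host marker."""
    return marker in html


def fetch_home_page(url: str) -> str:
    """Download a page and return its source as text (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

Passing the result of fetch_home_page for your own domain to served_from_new_host tells you which copy of the site your connection is reaching. As with viewing source in a browser, one positive result does not prove that every DNS server has caught up.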

Consider setting up a third-party, backup DNS service.

This allows you to respond to web site outages quickly. If your hosting provider’s DNS server goes down, then your web site will be down, too. For a nominal fee, you can set up a backup DNS listing through a company such as Ultra DNS to avoid this situation.

See Also

SnapNames (http://www.snapnames.com/) and NameProtect (http://www.nameprotect.com/) provide alert services for careful domain owners and administrators. The World Intellectual Property Organization Arbitration and Mediation Center, online at http://arbiter.wipo.int/center/index.html, works to resolve international intellectual property disputes. Ultra DNS, at http://www.ultradns.com, provides backup DNS listings to avoid web site downtime if your primary DNS service becomes unavailable.

1.3. Choosing a Server Platform and Hosting Plan

Problem

You need to narrow down the myriad web hosting choices to the best one for your web site.

Solution

First, consider which web server software and platform your site will be built on; open source Apache and Microsoft’s Internet Information Services (IIS) are by far the most common, although a handful of other web server applications offer special options for companies with particular web site needs. Then think about what features you may need (such as an e-commerce platform, SQL database, secure shell access, or phone-based tech support) to determine whether you should pay for a third-party hosting service or become your own webmaster and host the site yourself.

Discussion

Ten dollars a month will buy you a lot of web hosting, and $100 a month will buy you more than you ever knew you needed. Free hosting is worth just a little less than what you pay for it. And hosting your own site, especially if you’ve got real work to do, may ruin what little love for computers you may have. Before you let your cousin Mickey host your site from the server farm in his basement or jump at the first web hosting deal you find, spend time doing some long-range planning about how your web site may grow and change.

About 85 percent of the sites on the web these days are running either Apache, an open source and free descendant of the httpd code that served the first web pages, or IIS, a commercial application from Microsoft that is built into server versions of Windows. The rest of the web is covered by lesser-known server software such as Lotus Domino from IBM, Netscape Enterprise Server, Zeus, and StarNine’s WebStar (among others).

Rather than present a biased pro and con of the two leading choices, here are some neutral observations and facts about Apache and IIS:

  • Large corporations overwhelmingly favor IIS for their web sites.

  • Apache has about 70 percent of the total web server market, according to Netcraft.

  • Some hosting providers offer only Apache or IIS, although some offer both. The costs of similar plans on either platform are comparable.

  • IIS has a configuration utility with a graphical user interface (GUI).

  • Apache is best configured through text files and shell-prompt commands, although there is a GUI for Apache called Comanche.

  • Both Apache and IIS will run Perl and Python scripts, as well as JavaServer Pages applications.

  • Both can access SQL databases, but IIS has the advantage of better integration opportunities with Microsoft’s Windows-only desktop database application, Access, as well as Word and Excel.

  • The server-side scripting languages PHP and Microsoft’s Active Server Pages (ASP) can run on either platform, too. PHP is somewhat more common with Apache. Apache also requires an additional component to process ASP pages, while ASP is built into IIS.

  • Apache runs on all common Unix-flavor servers, as well as Windows. IIS only runs on Windows.

  • Both are fast, well-supported, and stable.

  • Few of your web site visitors will know or care which you use, and none will notice any difference in how your site behaves.

The choice usually comes down to a matter of personal preference. This book is geared toward web sites running Apache, so, after you select your (Apache-based) hosting account, you’ll have just one of many decisions about your hosting setup behind you. Shopping wisely for a place to host your web site will pay off in the long run. Although it’s not impossible to transfer a site from one host to another, the process has been known to ruin more than its share of web designers’ weekends.

With even the most basic, entry-level hosting plans offering more than enough disk storage and data transfer quota for a small- to medium-size web site, what are the features and factors that matter?

Secure Sockets Layer (SSL) server options

E-commerce transactions or other transmissions containing confidential information between your site visitors and your web server will require a certificate signed by a third-party certificate authority to verify your web site’s authenticity (Recipe 8.5 covers setting up certificates). In addition to the fee you’ll pay to the certificate authority, enabling the SSL functionality on your account to encrypt the data as it passes over the Internet usually involves a setup fee and ongoing monthly charges. Take those fees into account when choosing a hosting provider, even if you don’t need an SSL server on day one.

Charges for extra disk quota or bandwidth

A mild-mannered web site with predictably modest traffic patterns can easily fall victim to overuse fees when business booms or an unexpected link to the site causes web site activity to spike. Some hosting providers may shut down sites that exceed the account’s allotments or email the owner when there’s a problem. Others may grade your account’s usage numbers “on the curve,” throwing out the high and low numbers and charging you for the average of the remaining days of the month. Be sure you know your hosting company’s fees and policies on exceptional web site activity and how long it would take to upgrade your account and move your site to a better plan if your site’s new-found popularity becomes the norm.
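To see how grading “on the curve” softens a one-day spike, consider this Python sketch. It assumes the provider drops exactly one highest and one lowest day before averaging; real billing formulas vary, so treat the function (whose name is my own) as an illustration and check your provider’s terms.

```python
def curved_daily_usage(daily_usage: list[float]) -> float:
    """Average daily usage after discarding the single highest and lowest days."""
    if len(daily_usage) < 3:
        raise ValueError("need at least three days of usage")
    ordered = sorted(daily_usage)
    kept = ordered[1:-1]  # drop the lowest and the highest values
    return sum(kept) / len(kept)
```

With daily transfers of 100, 120, 110, 900, and 5 megabytes, the 900 MB spike and the 5 MB lull are discarded, and the billed average is 110 MB.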

Phone-based technical support

How quickly can you get someone on the phone if there’s a problem? Hello?

Ease of adding other domains and web sites

At some point, you may want to host a second domain name and web site on your existing hosting account. Your hosting company may see this as a new revenue stream from you to them. If you need to host more than one site, look for a provider with the most reasonable fees for this service.

Anonymous FTP access

A no-login-required “drop box” on your web server can be a faster, more convenient, and more reliable way to receive large files from your site visitors than receiving them as email attachments.

Secure-shell access

Many of the solutions to Recipes presented in this chapter require shell or command-line access to your server through a Telnet connection. A better, more secure method of connecting to your server to get a shell-prompt for running commands is through a secure shell connection. Some providers may require that you request this feature in writing before enabling it. Look for a provider that offers this feature and take the steps necessary to enable it.

Backups

The local files/remote files site management setup of popular WYSIWYG web site editing applications (such as Dreamweaver) automatically keeps a backup of web site files on the hard drives of one, or more, of the people responsible for the site. But this type of backup doesn’t include all the crucial files. Make sure your hosting account includes a regular backup scheme (preferably with an archive of older backups) that covers everything on the site: databases, CGI scripts, logs, and the like.

See Also

Recipe 8.5 on setting up self-signed certificates to work with SSL.

1.4. Enabling Server-Side Includes

Problem

You need to display the contents of one or more shared files in the body of your web pages.

Solution

Configure your web server to parse include tags for all files, or rename your files using the server-side include (SSI) friendly suffix for files that will be parsed.

A typical web server installation will have the module for parsing SSI tags enabled by default. If you have the ability to open and modify your Apache configuration file, check to make sure the following two lines are not commented out.

Tip

The location of Apache’s configuration file—httpd.conf—is set at installation. The default location is /etc/httpd/conf/httpd.conf. A commented, or inactive, line in the configuration file is preceded by a pound sign (#).

The two lines you’re looking for should be near the top of the file:

	LoadModule includes_module libexec/httpd/mod_include.so

and:

	AddModule mod_include.c

Any change you make to the file will require a web server restart to take effect (see Recipe 1.9). A file’s suffix determines if it is eligible for parsing. Typically, files ending with .shtml are parsed for includes, but files ending with the more familiar .html are not. Since most web page editors create files ending in .html (and most visitors to your site will assume that pages on your site end with .html), it’s a good idea to stick with that naming convention and enable SSIs on .html files, too.

Now, go back to the Apache configuration file, where you should find two lines together like this:

	AddType text/html .html
	AddHandler server-parsed .shtml

On that second line you want to add “.html” so it reads like this:

	AddHandler server-parsed .shtml .html

If you don’t have access to the master configuration file, you can still change the way Apache parses the files for your site with an .htaccess file. Just create the file in your web site root directory and paste in the first and third lines of code above, like this:

	AddType text/html .html
	AddHandler server-parsed .shtml .html

Tip

A web server restart is not required when you use this method.

Discussion

Server-side includes are one of the most powerful, yet easy to use, tools in a web designer’s bag of tricks. Before the days of reliable web page templates built in a WYSIWYG web page editor or a content management system, SSIs were just about the only way to ensure web site consistency across a multipage site. Server-side includes allow a web designer to save shared content in a single file and display it on multiple pages.

Tip

If you’re building even a modest-size web site, make sure that SSI functionality is available for every page on your web site.

Server-side includes have other magical powers, like displaying the date of a web page’s last modification and executing and displaying the results of a CGI script in an otherwise static web page. We’ll come back to these techniques in Chapter 4.

Bear in mind that parsing every .html file on your site for includes puts an extra load on your web server. You should not notice a decrease in your web site’s performance if you follow a couple of guidelines about using SSIs. First, keep the number of SSI files included on your pages to a minimum. If you have two or three includes strung together in your page code, combine them into one file if possible. Don’t build your pages out of includes.

Also, don’t nest include files inside other include files. Since Apache parses files for includes based on the file suffix, give your include files—even though they might contain HTML code—a distinct suffix such as .inc or .ssi so Apache won’t look through the include file for more includes to parse.

1.5. Setting the Default Filename for a Directory or Entire Site

Problem

You need to tell your web server the name of the default page for a given directory or all directories on your web site.

Solution

Add or modify the DirectoryIndex entry in your httpd.conf file, or a specific directory’s .htaccess file. List the files that should be treated as default pages in the order you wish them to be served:

	DirectoryIndex index.php index.html index.htm index.php3 welcome.html

Discussion

When a visitor to your web site requests a URL without a specific filename—say, http://yourwebsite.com/news/—the web server needs to decide which page to send back to the browser. The file can have any name, and be a static file or one that is dynamically generated. Regardless of how the file is created, if it’s missing when requested, then your visitors will see an ugly list of every other file in the directory or, worse, a 403 Forbidden error telling them they don’t have permission to access the directory.

Tip

For more about denying auto-indexing of a directory, see Recipe 5.5.

As you saw in Recipe 1.4 on server-side includes, the setting that determines the name of a directory’s default page resides in the main Apache configuration file, and may be overridden by an .htaccess file. Apache configuration changes listed in an .htaccess file apply to all of the files in the same directory and to all of the files in subdirectories below it that don’t have their own .htaccess file.

The line to look for in the configuration file—or to add to an .htaccess file—looks something like this:

	DirectoryIndex index.php index.html index.htm index.php3 welcome.html

The setting can contain more than one default file option, listed in descending order of priority. In this example, the server will look for index.php first, then index.html, and so on down the line.
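The priority order can be pictured with a short Python sketch. It is only an illustration, under the assumption that the server takes the first listed name that exists in the requested directory; pick_default_page is a hypothetical helper and not part of Apache.

```python
# Illustration only: mimics the order in which a DirectoryIndex list is
# consulted. pick_default_page is a hypothetical helper, not an Apache API.

DIRECTORY_INDEX = ["index.php", "index.html", "index.htm", "index.php3", "welcome.html"]


def pick_default_page(files_in_directory, candidates=DIRECTORY_INDEX):
    """Return the first candidate that exists in the directory, or None."""
    present = set(files_in_directory)
    for name in candidates:
        if name in present:
            return name
    # No default page found: the server falls back to a listing or an error.
    return None
```

With a directory that holds both welcome.html and index.html, the sketch picks index.html because it appears earlier in the list.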

See Also

Recipe 1.4 on using configuration files to enable SSIs.

1.6. Making Sure Your Web Site Loads With and Without the “www” Prefix

Problem

You need to make sure that your web site can be accessed both with and without the “www” prefix.

Solution

If your web site won’t load without the “www” prefix—or it won’t load with “www”—then you may need to make a change to your DNS record. Some hosting providers allow customers with higher end accounts to change their own DNS records, but use caution if your account includes this feature and you’re not sure what you’re doing. If you’re in doubt, contact your hosting company for clarification or guidance.

If you have access to the command-line network tools nslookup or dig, either on your own PC or through a Telnet shell provided by your hosting account, you can investigate the details of the various listings in your DNS record without changing them. Some web-based tools (see the “See Also” section in this Recipe) can access the same DNS record information if you do not have access to dig or nslookup.

In the example below, a dig request on www.daddison.com shows it to be a CNAME, or canonical name, listing for daddison.com. A CNAME listing is an alias to the main, or A RECORD, listing in the domain name’s DNS record. Requests for either web site address—with or without the “www”—are answered with my web site:

	Lookup has started …

	; <<>> DiG 9.2.2 <<>> www.daddison.com any
	;; global options: printcmd
	;; Got answer:
	;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40212
	;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1

	;; QUESTION SECTION:
	;www.daddison.com. IN ANY

	;; ANSWER SECTION:
	www.daddison.com. 3600 IN CNAME daddison.com.

	;; AUTHORITY SECTION:
	daddison.com.     3600 IN NS ns22.pair.com.
	daddison.com.     3600 IN NS ns0000.ns0.com.

	;; ADDITIONAL SECTION:
	ns0000.ns0.com.   7110 IN A 216.92.61.2

	;; Query time: 98 msec
	;; SERVER: 151.164.20.201#53(151.164.20.201)
	;; WHEN: Wed Apr 20 15:50:18 2005
	;; MSG SIZE rcvd: 113
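If you would rather script the check than read dig output, a minimal Python sketch along these lines compares the addresses that the two names resolve to. The function name and the injectable resolve parameter are my own assumptions for illustration, and matching addresses are only one sign of a healthy setup (some sites legitimately answer from several addresses).

```python
import socket


def resolves_to_same_address(domain, resolve=socket.gethostbyname):
    """Return True if domain and www.domain resolve to the same IPv4 address.

    `resolve` is injectable so the check can be tried without network access.
    """
    try:
        bare = resolve(domain)
        www = resolve("www." + domain)
    except OSError:
        # One of the two names has no usable DNS entry.
        return False
    return bare == www
```

Calling resolves_to_same_address with your own domain name performs two live DNS lookups and changes nothing in the DNS record.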

Discussion

In the classic 1963 farce “It’s a Mad, Mad, Mad, Mad World,” an all-star cast from Hollywood’s Golden Age raced against each other to find a treasure hidden under a big “W.” In the late 1990s, during the Internet’s first (we hope) Golden Age, legions of dot-com entrepreneurs also sought riches, in this case, under “www.” With the wisdom of hindsight—and perhaps with a renewed appreciation for the farcical and tongue-tying nature of hyper-alliterative repetition—many web sites now present themselves on the World Wide Web sans “www.”

But old habits die hard. Although it’s now rare to find a reputable web-hosting provider that does not configure its servers to respond to browser requests for a web site both with and without the “www” prefix, the visitors to your site—especially novice web users—may assume that typing in the “www” is a requirement for getting to your web site.

From the perspective of the hosting company, the prefix is simply a shorthand for the type of Internet resource and server at their data center that handles the request. In a typical scenario, the DNS record for your domain has been configured to redirect requests starting with (or without) “www” to a host server that handles web requests. The DNS record may also have entries for “ftp,” “mail,” or other services enabled on your domain that are handled by other host servers.

There are also some good reasons to direct all of your visitors to your web site without the “www” prefix. Not only is it a time-wasting mouthful when spoken aloud in a broadcast or voicemail auto-attendant message, but some web site functionality—such as session cookies and SSL certificates—may be valid for web site addresses without the “www,” but not with it.

If your web server has the rewrite module enabled, you can create rules that tell Apache to seamlessly change the URL requested by the browser to something else. For example, requests for http://www.domain.com become http://domain.com. To do this, create or modify the .htaccess file in your web root directory with a rewrite rule to remove the “www.” from browser requests to your web site. First, find or create an .htaccess file. Then copy into it the code shown below, replacing domain.com with your domain name:

	RewriteEngine On
	RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
	RewriteRule ^(.*)$ http://domain.com/$1 [R=301,L]
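To see what the condition and rule do together, here is a rough Python model of the decision. It is a sketch of the logic, not of Apache's internals; redirect_target is a hypothetical name, and the model ignores details such as a port number in the Host header.

```python
import re

# Mirrors: RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
WWW_HOST = re.compile(r"^www\.domain\.com$", re.IGNORECASE)


def redirect_target(host, path):
    """Return the URL to redirect to, or None when the request is left alone.

    `path` is the request path without its leading slash, which is how a
    rule in the web root's .htaccess file sees it.
    """
    if WWW_HOST.match(host):
        # Mirrors: RewriteRule ^(.*)$ http://domain.com/$1 [R=301,L]
        return "http://domain.com/" + path
    return None
```

A request for news/index.html on www.domain.com yields http://domain.com/news/index.html, while a request that already uses domain.com passes through untouched.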

See Also

No-WWW is an online campaign to purge the “www” from web site addresses. Visit the site at http://no-www.org. Sam Spade Tools offers a variety of web-based domain tools at http://www.samspade.org/t/.

1.7. Creating and Accessing Directories Outside the Web Site Root Directory

Problem

You need to place some files out of the reach of HTTP requests.

Solution

Put your web pages in a directory one level below your login’s home directory and create and use directories at the same level as the “web root” to hide private files.

Discussion

I’m always a little annoyed when I FTP or Telnet into a new client’s web server for the first time and find that the home directory for the hosting account and the root directory for the web site are one and the same. Such setups are typical of basic hosting accounts and likely keep tech support calls to a minimum by eliminating the need for novice and do-it-yourself web designers to remember file paths on the web server when uploading their web pages. Just FTP into your web server and the site files are right there in front of you.

In these bare-bones web site setups, any file that gets uploaded to the hosting account home directory, or that is created by an automated process on the web site, can potentially be viewed and downloaded over the web. Sensitive or restricted data can include files containing your weblog, email list, or credit card merchant account passwords, database login information or backups, downloadable files that are only made available to authorized site visitors, auto-generated order logs containing confidential information about your online customers, or future versions of important pages on your site to be published at a later date.

Restricted file permissions and password-protected directories are among the other popular methods of keeping private files out of reach of browser requests. But keeping files in a directory that’s not even part of your web site (password protected or not) makes worrying about unauthorized web access unnecessary.

First, you’ll need to relocate your web site’s root directory with a few Apache rewrite rules placed in an .htaccess file, a technique first encountered in Recipe 1.6. You can start using this technique with either a new or existing web site. The changes will be immediate and transparent to your visitors.

Create a directory at the top level of your hosting account home directory—in other words, at the same level as your current home page HTML file. Call the directory something obvious, such as www, web, or htdocs. I’m going to use htdocs for the example below. Now create or modify the .htaccess file in your home directory. Copy into it the following rules for redirecting requests to your domain name to files in the htdocs directory:

	RewriteEngine on

	RewriteCond $1 !^htdocs/
	RewriteRule (.*) /htdocs/$1 [L]

	RewriteCond %{THE_REQUEST} ^[A-Z]+\ /htdocs/
	RewriteRule .* - [F]

The first line ensures that the Apache rewrite engine is on. Lines two and three invisibly redirect browser requests to files in the htdocs directory while keeping the rule from looping indefinitely. Because the rules in the .htaccess file apply to the directory the file is in as well as all the directories below it (including our new htdocs directory), the condition on line two prevents Apache from appending an infinite number of htdocs to the browser request. Lines four and five prevent direct requests for files in the htdocs directory.
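The interplay between the two rules is easier to follow in a small simulation. The Python below is a simplified model of my own, not Apache's actual processing: map_request is a hypothetical function, and the pass limit merely stands in for the safeguards a real server has against runaway rewrites.

```python
import re

# Mirrors: RewriteCond %{THE_REQUEST} ^[A-Z]+\ /htdocs/
DIRECT_HTDOCS = re.compile(r"^[A-Z]+ /htdocs/")


def map_request(request_line, max_passes=10):
    """Map a request line such as 'GET /news/ HTTP/1.1' to a path under the
    home directory, or return None if the request is refused."""
    path = request_line.split()[1].lstrip("/")
    for _ in range(max_passes):
        if not path.startswith("htdocs/"):
            # First rule: prepend htdocs/ and start over with the new path.
            # Without the condition, this step would repeat forever.
            path = "htdocs/" + path
            continue
        if DIRECT_HTDOCS.match(request_line):
            # Second rule: the browser itself asked for /htdocs/..., so refuse.
            return None
        return path
    raise RuntimeError("rewrite did not settle")
```

A request for /news/index.html settles on htdocs/news/index.html after one pass, while a request that names /htdocs/ directly is refused.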

Copy all your web site files and directories into the htdocs directory, delete the original web site files and directories at the top level of your user account, and you’re done. Now you can create other new directories at the same level as htdocs, such as includes, backups, and downloads. None of these new directories will be web accessible.

You can still use standard server-side include tags, introduced in Recipe 1.4, to reference files in the new “super” includes directory, like this:

	<!--#include virtual="/includes/ssi_file.inc" -->

You can also add includes to your web pages that are saved in an includes directory, within your new htdocs directory, like this:

	<!--#include virtual="/htdocs/includes/ssi_file.inc" -->

If your web site uses the popular server-side scripting language PHP, you can configure your scripts to read or write files to directories outside the htdocs directory. Apache’s includes functionality begins looking for files at the DocumentRoot specified in its configuration file, which, despite our rewrite rule, is the home directory in our example. PHP, on the other hand, can roam the entire file system of the server, which can make tracking down the exact path to the file you want to include a bit of a mind bender.

Fortunately, you can usually specify a base include path for PHP in your .htaccess file. Open the .htaccess file that you put the rewrite rules into earlier and add the full server path to your home directory.

Tip

If you don’t know the full server path to your home directory, use the pwd utility at a command line to your web server to find out.

The line in your .htaccess file should look like this:

	php_value include_path .:/path/to/your/hosting/account/home/directory/

An include statement in one of your PHP scripts that refers to a file above your new htdocs directory would look like this:

	<?php include("includes/ssi_file.inc"); ?>
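The way a relative filename is looked up along an include path can be sketched in Python. This is an approximation for illustration only: find_on_include_path and its is_file parameter are hypothetical, and PHP applies additional rules of its own.

```python
import os


def find_on_include_path(filename, include_path, is_file=os.path.isfile):
    """Return the first existing match for a relative filename, or None.

    Each directory in the colon-separated include_path is tried in order.
    `is_file` is injectable so the lookup can be tried without real files.
    """
    for directory in include_path.split(":"):
        candidate = os.path.join(directory, filename)
        if is_file(candidate):
            return candidate
    return None
```

With an include path of .:/home/doug/ (a made-up home directory), the lookup tries ./includes/ssi_file.inc first and then /home/doug/includes/ssi_file.inc.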

1.8. Automating Routine Tasks

Problem

You need to publish or change files overnight while you’re sleeping.

Solution

Use your web server’s built-in task scheduling utility, cron, to do the work for you.

Discussion

Web designers like their sleep just like anyone else—maybe more. The last thing any of us want to do is stay up late or get up early to post new information on a web site according to the boss or marketing department’s schedule.

Fortunately, Unix-based web servers come with a built-in task-scheduling utility called cron that can do everything from executing simple file operations—like copying an updated web page from a private directory to a public one—to running complete scripts with instructions for more complex site maintenance routines. Let’s look at the basics of cron and how you can use it to do simple site updates when you’re otherwise busy having a life.

Say, for example, that your company’s public relations department is working on an important news release for a new product announcement. The release is embargoed (held back from public view) until Wednesday morning, but they give you the final text of the release on Monday to build a page for the web site. The PR department wants the release to be posted on the site at 6:00 a.m. Eastern Standard Time, but your office is in Denver—two hours behind the East Coast—and you plan to be watching the back of your eyelids at that time. It’s cron to the rescue!

First, create the updated web page (newsrelease.html) and upload it to your web server. If you’re feeling lucky, and don’t think that URL-fishing site visitors or Google will find the page before the embargo date, put the new page in a new public directory on your web site. If you’re worried about your job security, you’re paranoid, or both, put the file in a directory that is password protected, hidden, or outside the root web site directory on the server. Either way, cron will be able to find and move the file when the time comes.

Now, tell your web server to use cron to move the file to the URL that the PR department will announce on Wednesday morning. Your web server stores the list of tasks, or cron jobs, that it will run for your account in a file called a crontab. From the command-line prompt to your server, type crontab -l to list the tasks. Assuming there are no tasks yet, the server will respond with something like “no crontab for doug.” Type crontab -e to create a new crontab file using a command-line text editor.

crontab entries start with the time and day on which they should run, followed by the command. They are listed in this order:

  1. Minute (0–59)

  2. Hour (0–23)

  3. Day of the month (1–31)

  4. Month of the year (1–12)

  5. Day of the week (Typically 0–6, with Sunday being 0 and Saturday 6, but some systems may use 1–7, starting with Monday. Double-check your system to be sure.)

Using the Unix move utility, mv, the crontab line for your scheduled site update should look like this, assuming the server is in the same time zone as you are:

	0 4 * * 3 /bin/mv /private/newsrelease.html /public/newsrelease.html

Alternately, if the embargo date is the 15th of the month, you can use this line in your crontab:

	0 4 15 * * /bin/mv /private/newsrelease.html /public/newsrelease.html

Asterisks are wildcards that cron will use to run the task on any day of the month, month of the year, and so forth. A crontab entry that begins with five consecutive asterisks will run on every minute of every day of the year.

Tip

A crontab entry with five consecutive asterisks may also generate an email from your web server’s system administrator if the scheduled task encroaches on server performance.

Tweaking your crontab

To schedule a recurring task, say every 15 minutes, use this syntax:

	*/15 * * * * /bin/mv /private/newsrelease.html /public/newsrelease.html
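If you want to convince yourself that an entry will fire when you expect, a simplified matcher like the Python sketch below can help. It is not how cron is implemented: it understands only an asterisk, a single number, and the */n step form, and it requires every field to match, whereas many cron implementations treat a restricted day of the month and a restricted day of the week as either/or. The function names are my own.

```python
from datetime import datetime


def parse_entry(line):
    """Split a crontab line into its five time fields and the command."""
    minute, hour, day, month, weekday, command = line.split(None, 5)
    return {"minute": minute, "hour": hour, "day": day,
            "month": month, "weekday": weekday, "command": command}


def field_matches(field, value, minimum=0):
    """Handle '*', '*/n', and a single number; real cron accepts more."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return (value - minimum) % int(field[2:]) == 0
    return int(field) == value


def is_due(line, when):
    """Return True if the entry would run during the minute of `when`."""
    entry = parse_entry(line)
    # Python counts Monday as 0; this sketch assumes cron's Sunday-as-0 style.
    cron_weekday = (when.weekday() + 1) % 7
    return (field_matches(entry["minute"], when.minute)
            and field_matches(entry["hour"], when.hour)
            and field_matches(entry["day"], when.day, minimum=1)
            and field_matches(entry["month"], when.month, minimum=1)
            and field_matches(entry["weekday"], cron_weekday))
```

For example, is_due applied to the Wednesday entry returns True for datetime(2005, 4, 20, 4, 0), which fell on a Wednesday, and False for the same time a day later.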

If your web server is located in a different time zone than you are, some versions of cron let you use your local time for running automated tasks. Check the crontab manual page on your server to confirm that it honors this setting, then add a time zone configuration line to the top of your crontab like this:

	TZ=US/Central

Note that you can’t easily change the overall time zone setting for your web server—especially in a shared hosting setup—because Apache takes its time setting from the server’s system clock. If getting the correct time zone for things like time and date stamps on files and order receipts is important, consider hosting your account on a server that resides in your time zone. Or upgrade your hosting account to a virtual or dedicated server, which may give you more control over the server’s clock, even if the server itself is in a different time zone.

By default, cron emails any output that a task in your crontab produces, including error messages, to your login account’s default inbox. Feedback from cron provides valuable debugging information, but emails from frequently recurring tasks can choke your inbox and eat up the disk quota on your hosting account. To turn off the notifications, add this line to the top of your crontab:

	MAILTO=""

Tip

You can also create and modify your crontab in a text editor such as NotePad or BBEdit and then upload the file to your server.

Be sure to make the file you upload executable by the owner (you) by using the chmod utility to change the permissions on the file with this command:

	chmod u+x /path/to/my_crontab_file

Then use this command to load the entries in your file into the server’s notion of your crontab:

	crontab /path/to/my_crontab_file

Then double-check that the crontab was loaded correctly from your file by typing:

	crontab -l

This should output your crontab file exactly as you entered it:

	*/15 * * * * /bin/mv /path/to/your/privateorhidden/directory/newsrelease.html /path/to/your/public/directory/newsrelease.html

1.9. Restarting Your Web Server

Problem

You need to restart the HTTP daemon that processes requests for web pages on your server.

Solution

At the command-line prompt for your server, issue the apachectl graceful command, or the appropriate restart command for your web server.

Discussion

Restarting your web server has come up in several of the topics covered in this chapter. When you modify the configuration file for Apache, you have to restart it for any changes to take effect.

Basic webhosting accounts usually share Apache server processes with other web sites, so if that’s the case with your web site, your provider may not want or allow you to restart Apache. Web designers with higher priced virtual server accounts, or accounts running on a dedicated server, have Apache all to themselves and usually can issue the commands for stopping and starting it as needed.

Later versions of Apache install with a control script called apachectl. With it, you can start, stop, and restart the HTTP daemon on your dedicated server or “virtual server” account.

Finding the script

You should be able to use apachectl at the command-line prompt to your web server by typing its name followed by a space and stop, start, or graceful. If that does not work, you will have to specify the full path to the script. To locate the script on your server, use one of these commands:

	find / -name apachectl

or:

	which apachectl

Stopping and starting Apache

The results of the commands apachectl stop and apachectl start are self-evident.

Warning

The stop command immediately turns off the server, cutting off connections that may still be in the process of downloading pages from your web site.

Gracefully restarting Apache

A better way to restart Apache after a change to the configuration file is with the graceful argument to apachectl. In this case, Apache leaves current connections to browsers open and starts applying the changed settings to new connections.

See Also

The Apache Software Foundation web site has more information about stopping and starting the Apache web server on the manual page for apachectl at http://httpd.apache.org/docs/programs/apachectl.html and a guide to restarting Apache at http://httpd.apache.org/docs/stopping.html.

1.10. Monitoring Web Server Activity

Problem

You want to see programs your web server is running and user requests for web pages.

Solution

Use command-line tools to get a real-time snapshot of web server activity:

tail

Returns the last part of a file, such as the most recent connection entries from the web server logfile

grep

Searches for a pattern in a file, such as specific filenames or error codes from the web server logfile

ps

Reports on the status of web server processes

Discussion

Almost any decent web hosting account will record connections to your web site in logfiles that you can view and process. A good hosting provider may even help you automate the task of purging the connection records—or log rolling—so the files do not consume your account’s disk quota, and give you access to web site statistics software, such as Analog or Urchin, that will generate easy-to-read reports about activity on your web site.

If you’re serious about your web site, then you should take advantage of the tools available to you and review web site traffic reports often to understand how visitors get to your site, what’s popular, and what’s working (or not working). How to look at and use web site traffic reports is covered in Recipe 9.9.

The access and error logs that provide the raw material for traffic reports are constantly updated. Traffic reports themselves, on the other hand, are usually generated less frequently—daily, or even weekly, in some cases. A situation may arise when you can’t wait for the next traffic report to be created. You need to get an up-to-the-minute picture of the who, what, and how many of your web site’s current activity. Here are some command-line tools you can use to take your web site’s pulse.

Using tail to track web site requests in real time

First, you’ll need to find your Apache access and error logfiles. They are usually saved in a separate logs directory and have names like access_log, access.log, or apache.access_log. The error log should be in the same directory with the access log, so once you’ve found the logs, Telnet into your web server and switch to the logfiles directory.

Now you can watch connections to your web site as they’re handled by Apache with the Unix utility tail. Assuming your access log is named access_log, type this command at your Telnet prompt:

	tail -f access_log

Your shell window should be filled with several lines, like this:

	128.118.152.116 - - [14/May/2005:12:49:26 -0500] "GET
	/swgr/index.php HTTP/1.1" 200 29070
	"http://daddison.com/index.html" "Mozilla/4.0
	(compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
	68.142.250.83 - - [14/May/2005:12:49:30 -0500] "GET
	/case_studies/cs01.html HTTP/1.0" 200 19604 "-" "Mozilla/4.0
	(compatible; MSIE 5.01; Windows NT; .NET CLR 1.1.4322)"
	165.83.120.231 - - [14/May/2005:12:49:33 -0500] "GET
	/clients/index.html HTTP/1.1" 301 255 "-" "Mozilla/4.0 (compatible;
	MSIE 6.0; Windows NT 5.1; SV1)"

Each line indicates the IP address, file requested, and status of each unique connection, or hit, to your web site. By default, tail shows the last 10 lines in the access log; the -f flag tells it to keep running and echo new lines to the shell window as they are appended to the file. See for yourself: open a browser window and, with your shell window still visible, hit a page on your web site. Your request should be duly noted by tail.
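When you need to pull those pieces out of a log entry in a script, a regular expression will do for simple cases. The Python sketch below assumes each entry sits on a single line in the layout shown above; parse_line and the field names are my own choices rather than anything defined by Apache.

```python
import re

# Assumed layout: IP, two ignored fields, [timestamp], "METHOD path protocol",
# status, size. Anything after the size (referrer, browser) is ignored.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields from one access log line, or None if the
    line does not follow the assumed layout."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields
```

Fed the first entry shown above as a single line, parse_line reports the IP address 128.118.152.116, the path /swgr/index.php, and the status 200.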

Using grep to find specific requests in the web server log

Going back to the problem in Recipe 1.8 about automatically updating pages on your site, let’s say that your boss wants to know how many hits to the company’s latest news release have been recorded today. And she can’t wait until tomorrow, when a nice and neat traffic report will be waiting on the site with the answer. With grep, you can narrow your focus on the access log to just see recent requests for a specific file.

At the Telnet prompt to your web server, you can instruct the grep utility to search the access log for the filename of the news release in the content of the current access log by typing this command:

	grep "GET /news/newsrelease.html" access_log

With the search string GET /news/newsrelease.html you’re looking for all the requests for newsrelease.html in the /news directory in the current server log. The results might look like this:

	24.91.149.141 - - [14/May/2005:13:55:45 -0500] "GET
	/news/newsrelease.html HTTP/1.1" 200 18912 "-" "Mozilla/4.0
	(compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
	213.219.80.16 - - [14/May/2005:13:56:36 -0500] "GET
	/news/newsrelease.html HTTP/1.1" 200 18912 "-" "Mozilla/4.0
	(compatible; MSIE 6.0; Windows 98; Win 9x 4.90)"
	70.176.205.66 - - [14/May/2005:13:58:09 -0500] "GET
	/news/newsrelease.html HTTP/1.1" 200 18912 "-" "Mozilla/4.0
	(compatible; MSIE 6.0; Windows NT 5.1; SV1)"

You can also send the results of the search to a file by modifying the command like this:

	grep "newsrelease.html" access_log > newsrelease_report.txt

And if you want to get really fancy, you can put that second grep command in your crontab file, have it run every 15 minutes, and let the boss check the hits herself.
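Since the boss asked for a number rather than a list of log entries, a few lines of Python can do the counting and, optionally, limit it to one day. This is a sketch under the assumption that each entry is on one line and that dates appear in the bracketed form shown above; count_hits and its parameters are hypothetical.

```python
def count_hits(log_lines, request="GET /news/newsrelease.html", day=None):
    """Count log lines containing the request string.

    If `day` is given (for example "14/May/2005"), only entries whose
    timestamp starts with that date are counted.
    """
    hits = 0
    for line in log_lines:
        if request not in line:
            continue
        if day is not None and "[" + day + ":" not in line:
            continue
        hits += 1
    return hits
```

Because an open file yields its lines one at a time, you can pass the result of open("access_log") straight to count_hits.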

You also can use grep to sift the access log for errors and unsuccessful requests that visitors to your web site are encountering. Each line in the log also includes an error code indicating the result of the request. Some common error codes are shown in Table 1-2. For a complete list, see the World Wide Web Consortium (W3C) list referred to in the “See Also” section of this Recipe.

Table 1-2. Common error codes

	Code    Meaning
	200     OK, the request has succeeded
	401     Unauthorized, the request requires authorization
	403     Forbidden, the request was refused
	404     Not found
	500     Internal server error
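To get a quick picture of how often each code turns up, you can tally them with a short script. The Python sketch below assumes that the code is the three-digit number immediately after the quoted request in each log line; tally_statuses is a hypothetical name.

```python
import re
from collections import Counter

# Assumes the code follows the closing quote of the request, as in:
# "GET /news/newsrelease.html HTTP/1.1" 200 18912
STATUS = re.compile(r'" (\d{3}) ')


def tally_statuses(log_lines):
    """Return a Counter mapping each code (as an int) to its number of hits."""
    counts = Counter()
    for line in log_lines:
        match = STATUS.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

A sudden jump in the count for 404 or 500 is usually a good reason to go back to grep and look at the individual entries.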

Using ps to monitor web server processes

Finally, there may come a time when you want to see what processes are running under your user ID on your web server. Use the Unix process report utility, ps, with this command, replacing userid with your own ID (right after the -U flag):

	ps -Uuserid

The results should look something like this, with httpd indicating Apache processes that are currently running on your web server:

	PID    TTY      TIME CMD
	11565  ?        0:00 httpd
	 1715  pts/5    0:00 tail
	11569  pts/6    0:00 tcsh
	11560  ?        0:00 httpd
	11567  ?        0:00 sshd
	11512  ?        0:00 sh
	11542  ?        0:01 httpd
	29475  ?        0:01 sshd
	29477  pts/5    0:00 tcsh
	 6373  ?        0:00 sshd
	11559  ?        0:00 httpd
	11578  pts/6    0:00 ps
	11557  ?        0:00 httpd
	11553  ?        0:00 httpd
	11554  ?        0:00 httpd

See Also

For a complete list of HTTP status code definitions, see the W3C page at http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html.

1.11. Building an Easy-to-Maintain Web Site with Free Tools

Problem

You need to set up a small web site and are willing to sacrifice some customization options in favor of saving money and getting it online quickly.

Solution

Employ a combination of free or inexpensive resources available on the Web to build a low-cost site that’s easy to maintain. The ingredients for this Recipe are:

  • A domain parked at a registrar that allows you to forward requests for the domain to another URL

  • A small amount of free hosting space provided by your internet service provider, school, employer, or other reliable web server operator

  • One or more blogs hosted by Blogger (or another free blogging service)

  • A free Flickr account for storing and sharing images you want to display on your site

  • A free del.icio.us account for managing links on the site, including its navigation

  • A Google-based site search form

Discussion

Although the rest of this book is devoted to in-depth solutions for building a substantial, highly customized web site, there are times when you need to get something online fast, cheaply, and under control. Fortunately, a slew of free or inexpensive web services have become available recently (some referred to under the banner of Web 2.0) that make doing so fairly easy.

As I explained in Recipe 1.1, web sites start with a domain name. Expect to pay $5 to $10 for a one-year registration, although you may find a cheaper deal for a domain in one of the newer top-level domains (such as .info or .biz) if you shop around. For this Recipe, I registered the domain dougaddison.info at GoDaddy.com (http://www.godaddy.com). When choosing a registrar for the site, make sure you can “park” the domain on their DNS servers for free (or a nominal fee) and forward requests for the domain to another URL. In addition to free parking and forwarding, GoDaddy also lets registrants “mask” the forwarded domain, which means that the browser location window will always display the domain name (dougaddison.info), even though the pages themselves will be served from another URL.

Next, you’ll need to find a small amount of hosting space for the site. I found mine through my internet service provider, SBC, which, through a partnership with Yahoo!, gives its customers a free GeoCities account with 15MB of disk space. The GeoCities control panel also has a web-based file manager for uploading and editing web pages stored on the account (see Figure 1-1). In my GoDaddy control panel, I set dougaddison.info to forward to my GeoCities address. Because I also instructed GoDaddy to mask the domain, visitors to the site will never see the GeoCities address.

Tip

For a design, I turned to another free online resource—the Layout-o-Matic at inknoise.com. Chapter 3 features additional resources on layout and color, including additional free resources for downloading pre-coded design templates and color schemes.

A free Blogger account solves the content management problem. With its user-friendly web-based writing and editing interface, Blogger’s blogging tools circumvent the need for your less computer-savvy site contributors to set up an FTP client and understand the process of uploading files to the web server. Blogger also offers a variety of design templates for displaying your blog at an address on their server (i.e., dougaddison.blogspot.com). But for this site, you will self-syndicate an RSS feed from your blog and display it on a page that you upload just once to your free hosting space. See Recipe 6.7 for a discussion of three methods for doing this.

Figure 1-1. Free web space from GeoCities comes with a web-based file manager for uploading and editing web pages

Self-syndication will be the key to adding navigation and images to the site, and free accounts with del.icio.us and Flickr will provide the tools for doing so. As darlings of the Web 2.0 movement, Flickr and del.icio.us are leading the way in opening up new ways of managing images and links online. The tagging features of both services (dubbed folksonomy for their grass-roots inversion of traditional top-down categorization, or taxonomy, of online resources) enable novel and inspiring ways of communal publishing and sharing on the web.

At the most basic level, del.icio.us is an online bookmark storage service. With it, you can ditch the bookmark list that your browser saves on your PC’s hard drive and have access to your favorite sites from any browser on any computer that you use. You also can define your own system for categorizing your bookmarks with one or more tags that you assign to each bookmark you add to your del.icio.us account. Flickr works in a similar way, but with images. A free Flickr account provides 20 MB of image-upload storage each month, as well as tools for tagging individual images, generating code for displaying them on another web site, uploading images automatically from your cell phone’s camera, and posting the images with a short description to a blog hosted by another service (including Blogger).

Best of all for your fast, cheap, and under control site, both del.icio.us and Flickr generate RSS feeds for each tag that you define. So in my del.icio.us account, I defined two tags for the links I want to display on the site: “sitenav” for the internal links and “sitelinks” for other web sites that I want to link to from dougaddison.info. Then, I plugged the self-syndication code for the two tag feeds into the pages where I want those links to appear: “sitenav” in the sidebar of every page and “sitelinks” on my Links page (see Figure 1-2).
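To make the idea of self-syndication more concrete, here is a minimal sketch in Python of what such code does in principle: it reads the items in a tag feed and turns them into an HTML list of links. This is not the code produced by the tools in Recipe 6.7. The sample feed, its page addresses, and the function name feed_to_html_list are all hypothetical, and a real del.icio.us or Flickr feed may use a different RSS version or extra namespaces.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample of an RSS 2.0 tag feed. A real feed would be fetched
# from the service and may be structured differently.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>sitenav</title>
    <item>
      <title>Home</title>
      <link>http://dougaddison.info/</link>
    </item>
    <item>
      <title>Work Samples &amp; Notes</title>
      <link>http://dougaddison.info/work.html</link>
    </item>
  </channel>
</rss>"""


def feed_to_html_list(feed_xml: str) -> str:
    """Turn the <item> elements of an RSS 2.0 feed into an HTML link list."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if not link:
            continue  # skip items that have nothing to link to
        # Escape both values so feed text cannot break the page markup.
        rows.append('  <li><a href="%s">%s</a></li>'
                    % (escape(link, quote=True), escape(title or link)))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(feed_to_html_list(SAMPLE_FEED))
```

Because the list is rebuilt from the feed each time, adding a bookmark with the “sitenav” tag would be enough to add a link to the sidebar, with no need to edit and re-upload the page itself.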

Figure 1-2. My quick, basic site uses free tools from Blogger, Flickr, del.icio.us, and Google

Likewise with Flickr, I created a tag called “worksamples,” uploaded some screenshots of web sites I’ve worked on recently along with a short description, then copied the self-syndication code generated by a tool described in Recipe 6.7 onto my Work Samples page. Alternatively, you can post Flickr images to your Blogger blog directly from Flickr (which will cause images and your other text-only posts to be displayed together) or create a second images-only blog on your Blogger account, post your Flickr images to it, and then display those posts separately on a different page.

Finally, you can easily add site-wide search with a free tool from Google, or copy the code below and replace YOUR DOMAIN NAME with your domain:

	<!-- SiteSearch Google -->
	<FORM method=GET action="http://www.google.com/search">
	<input type="hidden" name="ie" value="UTF-8">
	<input type="hidden" name="oe" value="UTF-8">
	<TABLE bgcolor="#FFFFFF"><tr><td>
	<A HREF="http://www.google.com/">
	<IMG SRC="http://www.google.com/logos/Logo_40wht.gif"
	border="0" ALT="Google"></A>
	</td>
	<td>
	<INPUT TYPE="text" name="q" size="31" maxlength="255" value="">
	<INPUT type="submit" name="btnG" VALUE="Google Search">
	<font size="-1">
	<input type="hidden" name="domains" value="YOUR DOMAIN NAME"><br><input
	type="radio" name="sitesearch" value=""> WWW <input type="radio" name="sitesearch"
	value="YOUR DOMAIN NAME" checked> YOUR DOMAIN NAME <br>
	</font>
	</td></tr></TABLE>
	</FORM>
	<!-- SiteSearch Google -->
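For readers curious about what the form sends when a visitor clicks the button, the following Python sketch assembles the same GET request by hand. The parameter names come straight from the form fields above. The function name site_search_url and the sample query are hypothetical, and whether Google continues to honor each parameter is an assumption rather than something this recipe confirms.

```python
from urllib.parse import urlencode


def site_search_url(query: str, domain: str) -> str:
    """Build the URL a browser would request when the search form is submitted."""
    # Same fields, in the same order, as the form: the two hidden encoding
    # inputs, the text box, the submit button, and the domain inputs with
    # the site-only radio button selected.
    params = [
        ("ie", "UTF-8"),
        ("oe", "UTF-8"),
        ("q", query),
        ("btnG", "Google Search"),
        ("domains", domain),
        ("sitesearch", domain),
    ]
    return "http://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(site_search_url("web hosting", "dougaddison.info"))
```

Choosing the WWW radio button instead would send an empty sitesearch value, which is how the form switches between searching one site and searching the whole web.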

See Also

For more information on the techniques described in this Recipe, see Recipes 1.1 and 6.7. To sign up and begin using the free tools, visit del.icio.us (http://del.icio.us), Blogger (http://blogger.com), Flickr (http://flickr.com), and Google Free WebSearch (http://www.google.com/searchcode.html).
