I’m assuming in this book that you are already experienced with HTML and with the basics of setting up a web site. That means you probably already have a hosting arrangement of some sort: a web server either within your own company or at an outside Internet Service Provider (ISP). To keep things simple, I’m going to assume that you’re hosting with an outside ISP, but in a large company it is quite possible that your “provider” will be an internal company department. This is especially likely if you are developing content for a company intranet, rather than the public Internet. It really doesn’t matter, though. The only question is, can your hosting partner, wherever that partner is located, deliver what you need?
As your web efforts grow more ambitious, you need to think carefully about your hosting arrangements. One of the things I’ve learned as I’ve gone from maintaining a personal home page to maintaining multiple commercial sites is the number of ways in which a site can outgrow its hosting environment. After a fair amount of (sometimes painful) experience, I’ve come up with the following list of attributes that I now look for when evaluating a hosting provider for my own web projects.
The first four items in the following list (Unix environment, shell access, cron capability, and CGI scripting/server-side includes) represent the minimum needed to implement the examples in this book. The remaining items may be more or less important depending on what you’re trying to do.
- Unix environment
The examples in this book assume that your web server is running Unix (loosely defined); specifically, the open source variant of Unix called Linux. Linux web servers are quite popular with ISPs because they deliver powerful, stable performance at a minimal cost in terms of hardware and software.
- Shell access
The ability to log into a Unix shell session will also be necessary for running the examples in this book. You will use the Unix shell to run Perl scripts and various command-line utilities. This book assumes you will be using the bash shell (which is the default shell on most Linux systems), but most of the examples will work more or less the same under other shells.
- Cron capability
Along with the ability to log into the Unix shell, you should look for a hosting setup that allows you access to the Unix cron facility, which lets you run programs on the server at specified times, regardless of whether you are logged on. Such cron jobs are extremely useful for making backups, analyzing log files, checking various aspects of server operation, automatically notifying you if there is a problem, and so on.
- CGI scripting/server-side includes
The ability to generate pages dynamically is an important feature of many high-end web sites. Because the CGI scripting or server-side includes (SSIs) used to achieve this effect can consume a lot of resources on the server (and can also represent significant security risks), inexpensive hosting packages often do not include access to them. Some of the examples in this book assume you will be able to use them, though.
- Disk space
When you are creating all of your HTML pages manually, 10 or 20 MB of disk space can seem like a lot of room. If you are going to have a lot of graphics or multimedia files, though, you’ll use it up in a hurry. Using Perl to generate web pages from large data files, which we’ll be doing later on, also burns through disk space like there’s no tomorrow. Log files, reports, backups: all of them take space, and disk space is cheap compared to the time and effort that go into filling it up. Plan accordingly.
- Secure server capability
The ability to support encrypted web transactions is important if your site is going to be exchanging sensitive information, such as credit card numbers, with its users. Hosting packages that include such “secure servers” tend to be significantly more expensive than those that don’t. None of the examples in this book require secure web server capability.
- Root access
The ability to log into the superuser, or root, account on your web server is sometimes necessary for things like configuring the server, installing software, and so on (though none of the examples in this book assume that you have such access). Because the root account can do pretty much anything on a Unix server, ISPs aren’t likely to give access to it to just anyone. You almost certainly won’t have access to it if you are in a shared hosting environment. Even then, though, there may occasionally be times when a little superuser help at the right time can make a big difference. This is one reason why it’s a good idea to take your ISP’s tech support people to lunch once in a while (or at least send them something yummy at holiday time). People are a lot more likely to drop what they’re doing to solve your silly problem if their stomachs think kindly toward you.
- CPU load
One of the hidden costs of using an inexpensive web hosting package is that you’ll be sharing the server with a lot of other users. Assuming you all have the ability to do interesting things like use CGI scripts and server-side includes, the server might end up doing a lot of work—which can slow things down unacceptably. If your ISP is pursuing a business strategy of adding users as quickly as possible and falls even a little bit behind in terms of upgrading the infrastructure, the situation can go from bad to worse very quickly.
- Uptime/reliability/disaster recovery
Another hidden cost of inexpensive web hosting packages is the low level of staffing and disaster preparation that you may encounter. How often do problems (say, a hard-disk failure on the server) occur? When they happen, how long is it before the ISP’s technical staff becomes aware of them? How long before the problem is corrected? Even if you’re confident in your ISP’s ability to stay on top of issues like this (and especially if you’re not), inquiring about their backup systems and keeping your own independent backup copies of all important data is a very good idea.
- Connectivity
Network slowdowns and outages are a fact of life on today’s Internet. Depending on what sort of connectivity your site has, such outages can be either a continuing headache or an occasional annoyance. Does your ISP have multiple, independent pipelines to the Internet backbone? Does it have connections with smaller, regional networks so that it can bypass the major routes in case of congestion?
- Bandwidth
Closely related to the question of connectivity is that of bandwidth: how big are the pipes through which your web traffic will be flowing? Depending on how busy your site is, a single T1 line (which carries data at 1.5 megabits/second) may be more than enough—but depending on your hosting situation you may be sharing that bandwidth with a dozen or a hundred or a thousand other web sites.
- Data transfer limits
Some ISPs offer unlimited data transfer; others say your site can deliver only so much data in a given period of time before they shut you down or charge you extra. You’ll have to know how much traffic your site normally generates in order to evaluate what this means to you. As an example, the most popular commercial site I work on (which is not really all that popular by web standards) currently pushes about a gigabyte of traffic on a good day.
- Environmental stability
A good ISP is always working to keep its web hosting platform current: installing new software versions, fixing bugs, tweaking the caching technology, and so on. The downside to this, though, is that you may have built something that relies on some aspect of the old environment. Ideally, your ISP will not change things out from under you without giving you prior warning. In reality, though, depending on your hosting situation, you may not get that kind of notice.
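To make the cron item in the list above a little more concrete, here is a minimal sketch of the kind of backup job it describes. Everything specific in it is an assumption made for illustration: the script name backup_site.sh, the function name backup_site, the directory paths, and the 2:30 a.m. schedule are all hypothetical, and your ISP may restrict or configure cron differently.

```shell
#!/bin/sh
# backup_site.sh (hypothetical name): archive a web directory into a dated tarball.
#
# A crontab entry along these lines (added with `crontab -e`) would run it
# every night at 2:30 a.m.; the paths are placeholders, not real locations:
#
#   30 2 * * * /home/you/bin/backup_site.sh /home/you/public_html /home/you/backups

# backup_site SRC_DIR DEST_DIR
# Runs in a subshell (note the parentheses) so its variables stay local.
backup_site() (
    src=$1
    dest=$2
    stamp=$(date +%Y%m%d)            # date stamp in YYYYMMDD form
    mkdir -p "$dest" || exit 1
    # -C switches into the source directory so the archive holds relative paths.
    tar -czf "$dest/site-$stamp.tar.gz" -C "$src" .
)

# Only do the work when both arguments are supplied.
if [ "$#" -eq 2 ]; then
    backup_site "$1" "$2"
fi
```

Keeping the destination directory outside the source directory avoids archiving old backups into new ones.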
You’ve probably detected a consistent theme in this list: for the most part you get what you pay for in web hosting. A package that seems like a really good deal in one area may be a ticking time bomb in another. For a medium- to high-profile commercial project, you should be prepared to pay for appropriate hosting.
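The bandwidth and data transfer items in the list above are easier to compare once they are in the same units. The following is a rough back-of-the-envelope sketch, not a capacity-planning tool: the function names are hypothetical, a gigabyte is assumed to be 10^9 bytes, the T1 rate is taken as the rounded 1.5 megabits/second quoted above, and traffic is assumed to be spread evenly over the day (real traffic is burstier, so peaks will be higher).

```shell
#!/bin/sh
# Rough bandwidth arithmetic. Assumes 1 GB = 10^9 bytes and an 86400-second day.

# average_kbps GB_PER_DAY: average rate needed to move that much data in a day.
average_kbps() {
    awk -v gb="$1" 'BEGIN { printf "%.1f\n", gb * 1e9 * 8 / 86400 / 1000 }'
}

# max_gb_per_day MBPS: most data a link of that speed could carry in a day.
max_gb_per_day() {
    awk -v mbps="$1" 'BEGIN { printf "%.1f\n", mbps * 1e6 * 86400 / 8 / 1e9 }'
}

# share_kbps MBPS SITES: even split of a link among a number of sites.
share_kbps() {
    awk -v mbps="$1" -v sites="$2" 'BEGIN { printf "%.1f\n", mbps * 1000 / sites }'
}

average_kbps 1        # a gigabyte a day        -> 92.6
max_gb_per_day 1.5    # one T1, fully saturated -> 16.2
share_kbps 1.5 100    # T1 split 100 ways       -> 15.0
```

Read together, the figures suggest that a gigabyte a day is a small fraction of a dedicated T1 (about 6 percent) but roughly six times an even 100-way share of one.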