Apache Security
By Ivan Ristic
February 2005
Pages: 420



Table of Contents

Chapter 1: Apache Security Principles
This book contains 12 chapters. Of those, 11 cover the technical issues of securing Apache and web applications. Looking at the number of pages alone, it may seem that the technical issues represent the most important part of security. But wars are seldom won on tactics alone, and technical issues are just tactics. To win, you need a good overall strategy, and that is the purpose of this chapter. It has the following goals:
  • Define security
  • Introduce essential security principles
  • Establish a common security vocabulary
  • Present web application architecture blueprints
The Web Application Architecture Blueprints section offers several different views (user, network, and Apache) of the same problem, with a goal of increasing understanding of the underlying issues.
Security Definitions
Security can be defined in various ways. One school of thought defines it as reaching the three goals known as the CIA triad:
Confidentiality
Information is not disclosed to unauthorized parties.
Integrity
Information remains unchanged in transit or in storage until it is changed by an authorized party.
Availability
Authorized parties are given timely and uninterrupted access to resources and information.
Another goal, accountability, defined as being able to hold users accountable (by maintaining their identity and recording their actions), is sometimes added to the list as a fourth element.
The other main school of thought views security as a continuous process, consisting of phases. Though different people may name and describe the phases in different ways, here is an example of common phases:
Assessment
Analysis of the environment and the system security requirements. During this phase, you create and document a security policy and plans for implementing that policy.
Protection
Implementation of the security plan (e.g., secure configuration, resource protection, maintenance).
Detection
Identification of attacks and policy violations by use of techniques such as monitoring, log analysis, and intrusion detection.
Response
Handling of detected intrusions, in the ways specified by the security plan.
Both lines of thought are correct: one views the static aspects of security and the other views the dynamics. In this chapter, I look at security as a process; the rest of the book covers its static aspects.
Another way of looking at security is as a state of mind. Keeping systems secure is an ongoing battle where one needs to be alert and vigilant at all times, and remain one step ahead of adversaries. But you need to come to terms with the fact that being 100 percent secure is impossible. Sometimes, we cannot control circumstances, though we do the best we can. Sometimes we slip. Or we may have encountered a smarter adversary. I have found that being humble increases security. If you think you are invincible, chances are you won't be alert to lurking dangers. But if you are aware of your own limitations, you are likely to work hard to overcome them and ensure all angles are covered.
Web Application Architecture Blueprints
I will now present several different ways of looking at a typical web application architecture. The whole thing is too complex to depict in a single illustration, which is why we need to use the power of abstraction to cope with the complexity. Broken into three different views, the problem becomes easier to manage. The three views presented are the following:
  • User view
  • Network view
  • Apache view
Each view comes with its own set of problems, which need to be addressed one at a time until all problems are resolved. The three views together practically map out the contents of this book. Where appropriate, I will point you to sections where further discussion takes place.
The first view, presented in Figure 1-1, is deceptively simple. Its only purpose is to demonstrate how a typical installation has many types of users. When designing the figure, I chose a typical business installation with the following user classes:
  • The public (customers or potential customers)
  • Partners
  • Staff
  • Developers
  • Administrators
  • Management
Figure 1-1: Web architecture: user view
Members of any of these classes are potential adversaries for one reason or another. To secure an installation you must analyze the access requirements of each class individually and implement access restrictions so members of each class have access only to those parts of the system they need. Restrictions are implemented through the combination of design decisions, firewall restrictions, and application-based access controls.
As far as attackers are concerned, user accounts and workstations are legitimate attack targets. An often-successful attack is to trick some of the system users into unknowingly installing keylogger software, which records everything typed on the workstation and relays it back to the attacker. One way this could be done, for example, is by having users execute a program sent via email. The same piece of software could likely control the workstation and perform actions on behalf of its owner (the attacker).
Technical issues are generally relatively easy to solve provided you have sufficient resources (time, money, or both). People issues, on the other hand, have been a constant source of security-related problems for which there is no clear solution. For the most part, users are not actively involved in the security process and, therefore, do not understand the importance and consequences of their actions. Every serious plan must include sections dedicated to user involvement and user education.
Chapter 2: Installation and Configuration
Installation is the first step in making Apache functional. Before you begin, you should have a clear idea of the installation's purpose. This idea, together with your paranoia level, will determine the steps you will take to complete the process. The system-hardening matrix (described in Chapter 1) presents one formal way of determining the steps. Though every additional step you take now makes the installation more secure, it also increases the time you will spend maintaining security. Think about it realistically for a moment. If you cannot put in that extra time later, then why bother putting the extra time in now? Don't worry about it too much, however. These things tend to sort themselves out over time: you will probably be eager to make everything perfect in the first couple of Apache installations you do; then, you will likely back off and find a balance among your security needs, the effort required to meet those needs, and available resources.
As a rule of thumb, if you are building a high-profile web server, public or not, always go for a highly secure installation.
Though the purpose of this chapter is to be a comprehensive guide to Apache installation and configuration, you are encouraged to read others' approaches to Apache hardening as well. Every approach has its unique points, reflecting the personality of its authors. Besides, the opinions presented here are heavily influenced by the work of others. The Apache reference documentation is a resource you will go back to often. In addition to it, ensure you read the Apache Benchmark, which is a well-documented reference installation procedure that allows security to be quantified. It includes a semi-automated scoring tool to be used for assessment.
The following is a list of some of the most useful Apache installation documentation I have encountered:
  • Apache Online Documentation (http://httpd.apache.org/docs-2.0/)
  • Apache Security Tips (http://httpd.apache.org/docs-2.0/misc/security_tips.html)
  • Apache Benchmark (http://www.cisecurity.org/bench_apache.html)
  • "Securing Apache: Step-by-Step" by Artur Maj (
Installation
The installation instructions given in this chapter are designed to apply to both active branches (1.x and 2.x) of the Apache web server running on Linux systems. If you are running some other flavor of Unix, I trust you will understand what the minimal differences between Linux and your system are. The configuration advice given in this chapter works well for non-Unix platforms (e.g., Windows) but the differences in the installation steps are more noticeable:
  • Windows does not offer the chroot functionality (see Section 2.4) or an equivalent.
  • You are unlikely to install Apache on Windows from source code. Instead, download the binaries from the main Apache web site.
  • Disk paths are different though the meaning is the same.
One of the first decisions you will make is whether to compile the server from the source or use a binary package. This is a good example of the dilemma I mentioned at the beginning of this chapter. There is no one correct decision for everyone or one correct decision for you alone. Consider some pros and cons of the different approaches:
  • By compiling from source, you are in the position to control everything. You can choose the compile-time options and the modules, and you can make changes to the source code. This process will consume a lot of your time, especially if you measure the time over the lifetime of the installation (it is the only correct way to measure time) and if you intend to use modules with frequent releases (e.g., PHP).
  • Installation and upgrades are a breeze with binary distributions, now that many vendors provide tools that update operating systems automatically. You exchange some control over the installation in return for not having to do everything yourself. However, this choice means you will have to wait for security patches or for the latest version of your favorite module. In fact, the latest version of Apache or your favorite module may never come, since most vendors choose to use one version in a distribution and only issue patches to that version to fix potential problems. This is a standard practice, which vendors use to produce stable distributions.
Configuration and Hardening
Now that you know your installation works, make it more secure. Being brave, we start with an empty configuration file, and work our way up to a fully functional configuration. Starting with an empty configuration file is a good practice since it increases your understanding of how Apache works. Furthermore, the default configuration file is large, containing the directives for everything, including the modules you will never use. It is best to keep the configuration files nice, short, and tidy.
Start the configuration file (/usr/local/apache/conf/httpd.conf) with a few general-purpose directives:
# location of the web server files
ServerRoot /usr/local/apache
# location of the web server tree
DocumentRoot /var/www/htdocs
# path to the process ID (PID) file, which
# stores the PID of the main Apache process
PidFile /var/www/logs/httpd.pid
# which port to listen at
Listen 80
# do not resolve client IP addresses to names
HostNameLookups Off
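One way to double-check a hand-built configuration like this one is to confirm that each directive is really present and not commented out. The following is a minimal sketch rather than anything Apache provides: the helper name has_directive and the temporary sample file are assumptions made for illustration, and the match is a simple exact comparison (Apache itself treats directive names case-insensitively).

```shell
#!/bin/sh
# Hypothetical helper: succeed if a non-comment line in the configuration
# file starts with the given directive name followed by the given value.
has_directive() {
    # $1 = config file, $2 = directive name, $3 = expected value
    awk -v name="$2" -v value="$3" '
        $1 == name && $2 == value { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Sample file repeating the directives from the text (a stand-in for
# the real httpd.conf, so the sketch can run anywhere).
conf=$(mktemp)
cat > "$conf" <<'EOF'
# location of the web server files
ServerRoot /usr/local/apache
DocumentRoot /var/www/htdocs
PidFile /var/www/logs/httpd.pid
Listen 80
HostNameLookups Off
EOF

for pair in "Listen 80" "HostNameLookups Off"; do
    # split "Name Value" into two positional arguments
    set -- $pair
    if has_directive "$conf" "$1" "$2"; then
        echo "ok: $1 $2"
    else
        echo "MISSING: $1 $2"
    fi
done
```

To check a real installation, point the helper at the actual configuration file instead of the temporary one.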
Upon installation, Apache runs as the user nobody. While this is convenient (this account normally exists on all Unix operating systems), it is a good idea to create a separate account for each different task. The idea behind this is that if attackers break into the server through the web server, they will get the privileges of the web server, that is, the privileges of the account it runs under. By having a separate account for the web server, we ensure the attackers do not get anything else for free.
The most commonly used username for this account is httpd, and some people use apache. We will use the former. Your operating system may come pre-configured with an account for this purpose. If you like the name, use it; otherwise, delete it from the system (e.g., using the userdel tool) to avoid confusion later. To create a new account, execute the following two commands while running as root.
# groupadd httpd
# useradd httpd -g httpd -d /dev/null -s /sbin/nologin
            
These commands create a group and a user account, assigning the account the home directory /dev/null and the shell /sbin/nologin (effectively disabling login for the account). Add the following two lines to the Apache configuration file
Changing Web Server Identity
One of the principles of web server hardening is hiding as much information from the public as possible. By extending the same logic, hiding the identity of the web server makes perfect sense. This subject has caused much controversy. Discussions usually start because Apache does not provide facilities to control all of the content provided in the Server header field, and some poor soul tries to influence Apache developers to add it. Because no clear technical reasons support either opinion, discussions continue.
I have mentioned the risks of providing server information in the Server response header field defined in the HTTP standard, so a first step in our effort to avoid this will be to fake its contents. As you will see later, this is often not straightforward, but it can be done. Suppose we try to be funny and replace our standard response "Apache/1.3.30 (Unix)" with "Microsoft-IIS/5.0" (it makes no difference to us that Internet Information Server has a worse security record than Apache; our goal is to hide who we are). An attacker sees this but sees no trace of Active Server Pages (ASP) on the server, and that makes him suspicious. He decides to employ operating system fingerprinting. This technique uses the variations in the implementations of the TCP/IP protocol to figure out which operating system is behind an IP address. This functionality comes with the popular network scanner NMAP. Running NMAP against a Linux server will sometimes reveal that the server is not running Windows. Microsoft IIS running on a Linux server—not likely!
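To see what a client learns from the Server header alone, it helps to pull that one field out of a response. The sketch below is only an illustration: the helper name server_header is an assumption, and the response is a canned string that uses the faked value from the example above instead of output from a live server.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the Server header found in raw
# HTTP response headers supplied on standard input.
server_header() {
    tr -d '\r' | awk '
        # header names are case-insensitive, so compare in lowercase
        tolower($0) ~ /^server:/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the name and leading blanks
            print
            exit
        }
    '
}

# Canned response headers carrying the faked identity from the text.
printf 'HTTP/1.1 200 OK\r\nServer: Microsoft-IIS/5.0\r\nContent-Type: text/html\r\n\r\n' |
    server_header
# prints: Microsoft-IIS/5.0
```

The header is all this check sees, which is why changing it defeats only the most casual inspection; the fingerprinting techniques described here look at behavior instead.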
There are also differences in the implementations of the HTTP protocol supplied by different web servers. HTTP fingerprinting exploits these differences to determine the make of the web server. The differences exist for the following reasons:
  • Standards do not define every aspect of protocols. Some parts of the standard are merely recommendations, and some parts are often intentionally left vague because no one at the time knew how to solve a particular problem so it was left to resolve itself.
  • Standards sometimes do not define trivial things.
Putting Apache in Jail
Even the most secure software installations get broken into. Sometimes, this is because you get the attention of a skilled and persistent attacker. Sometimes, a new vulnerability is discovered, and an attacker uses it before the server is patched. Once an intruder gets in, his next step is to look for a local vulnerability and become superuser. When this happens, the whole system becomes contaminated, and the only solution is to reinstall everything.
Our aim is to contain the intrusion to just a part of the system, and we do this with the help of the chroot(2) system call. This system call allows restrictions to be put on a process, limiting its access to the filesystem. It works by choosing a folder to become the new filesystem root. Once the system call is executed, a process cannot go back (in most cases, and provided the jail was properly constructed).
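On Linux, one rough way to see which folder a running process treats as its filesystem root is to read the /proc/<pid>/root symbolic link. The sketch below is an illustration only: the helper name process_root is an assumption, it relies on the Linux /proc filesystem and the readlink utility being available, and it inspects the current shell because no Apache process is assumed to exist.

```shell
#!/bin/sh
# Hypothetical helper: print the root directory a process sees.
# On Linux, /proc/<pid>/root is a symbolic link to that directory.
# Viewed from outside a jail, a value other than "/" suggests the
# process has been confined with chroot.
process_root() {
    # $1 = process ID
    readlink "/proc/$1/root" 2>/dev/null || echo "unknown"
}

# Inspect the current shell. For a web server you would pass the PID
# of the main server process instead; reading the link for a process
# owned by another user generally requires root privileges.
echo "root of PID $$: $(process_root $$)"
```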
The root user can almost always break out of jail. The key to building an escape-proof jail environment is not to allow any root processes to exist inside the jail. You must also not have a process outside jail running as the same user as a process inside jail. Under some circumstances, an attacker may jump from one process to another and break out of jail. That's one of the reasons why I have insisted on having a separate account for Apache.
The term chroot is often used interchangeably with the term jail. The term can be used as a verb and noun. If you say Apache is chrooted, for example, you are saying that Apache was put in jail, typically via use of the chroot binary or the chroot(2) system call. On Linux systems, the meanings of chroot and jail are close enough. BSD systems have a separate jail() call, which implements additional security mechanisms. For more details about the jail() call, see the following: http://docs.freebsd.org/44doc/papers/jail/jail.html.
Incorporating the jail mechanism (using either chroot(2) or jail()) into your web server defense gives the following advantages:
Containment
If the intruder breaks in through the server, he will only be able to access files in the restricted file system. Unable to touch other files, he will be unable to alter them or harm the data in any way.
Chapter 3: PHP
PHP is the most popular web scripting language and an essential part of the Apache platform. Consequently, it is likely most web application installations will require PHP's presence. However, if your PHP needs are moderate, consider replacing the functionality you need using plain-old CGI scripts. The PHP module is a complex one and one that has had many problems in the past.
This chapter will help you use PHP securely. In addition to the information provided here, you may find the following resources useful:
  • Security section of the PHP manual (http://www.php.net/manual/en/security.php)
  • PHP Security Consortium (http://www.phpsec.org)
Installation
In this section, I will present the installation and configuration procedures for two different options: using PHP as a module and using it as a CGI. Using PHP as a module is suitable for systems that are dedicated to a single purpose or for sites run by trusted groups of administrators and developers. Using PHP as a CGI (possibly with an execution wrapper) is a better option when users cannot be fully trusted, in spite of its worse performance. (Chapter 6 discusses running PHP over FastCGI which is an alternative approach that can, in some circumstances, provide the speed of the module combined with the privilege separation of a CGI.) To begin with the installation process, download the PHP source code from http://www.php.net.
When PHP is installed as a module, it becomes a part of Apache and performs all operations as the Apache user (usually httpd). The configuration process is similar to that of Apache itself. You need to prepare PHP source code for compilation by calling the configure script (in the directory where you unpacked the distribution), at a minimum letting it know where Apache's apxs tool resides. The apxs tool is used as the interface between Apache and third-party modules:
$ ./configure --with-apxs=/usr/local/apache/bin/apxs
$ make
# make install
            
Replace --with-apxs with --with-apxs2 if you are running Apache 2. If you plan to use PHP only from within the web server, it may be useful to put the installation together with Apache. Use the --prefix configuration parameter for that:
$ ./configure \
> --with-apxs=/usr/local/apache/bin/apxs \
> --prefix=/usr/local/apache/php
            
In addition to making PHP work with Apache, a command-line version of PHP will be compiled and copied to /usr/local/apache/php/bin/php. The command-line version is useful if you want to use PHP for general scripting, unrelated to web servers.
The following configuration data makes Apache load PHP when it starts and allows Apache to identify which pages contain PHP code:
# Load the PHP module (the module is in
# subdirectory modules/ in Apache 2)
LoadModule php5_module libexec/libphp5.so
# Activate the module (not needed with Apache 2)
AddModule mod_php5.c
   
# Associate file extensions with PHP
AddHandler application/x-httpd-php .php
AddHandler application/x-httpd-php .php3
AddHandler application/x-httpd-php .inc
AddHandler application/x-httpd-php .class
AddHandler application/x-httpd-php .module
Configuration
Configuring PHP can be a time-consuming task since it offers a large number of configuration options. The distribution comes with a recommended configuration file php.ini-recommended, but I suggest that you just use this file as a starting point and create your own recommended configuration.
Working with PHP you will discover it is a powerful tool, often too powerful. It also has a history of loose default configuration options. Though the PHP core developers have paid more attention to security in recent years, PHP is still not as secure as it could be.

Section 3.2.1.1: register_globals and allow_url_fopen

One PHP configuration option strikes fear into the hearts of system administrators everywhere, and it is called register_globals. This option is off by default as of PHP 4.2.0, but I am mentioning it here because:
  • It is dangerous.
  • You will sometimes be in a position to audit an existing Apache installation, so you will want to look for this option.
  • Sooner or later, you will get a request from a user to turn it on. Do not do this.
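When auditing an existing installation, a quick scan of the PHP configuration file shows whether the option has been switched on. The following is a minimal sketch under stated assumptions: the helper name register_globals_enabled and the sample file are made up for illustration, the parser handles only simple "name = value" lines, and it assumes that a later setting overrides an earlier one. A missing setting is treated as disabled, in line with the default since PHP 4.2.0.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given php.ini turns register_globals on.
register_globals_enabled() {
    # $1 = path to php.ini
    awk -F= '
        /^[ \t]*;/ { next }              # skip comment lines
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)
            sub(/;.*/, "", val)          # drop a trailing comment
            gsub(/[ \t"]/, "", val)
            # assumption: a later setting overrides an earlier one
            if (key == "register_globals") last = tolower(val)
        }
        END {
            on = (last == "on" || last == "1" || last == "true" || last == "yes")
            exit on ? 0 : 1
        }
    ' "$1"
}

# Sample file standing in for a real php.ini.
ini=$(mktemp)
printf '; sample settings\nregister_globals = On\n' > "$ini"

if register_globals_enabled "$ini"; then
    echo "WARNING: register_globals is enabled in $ini"
else
    echo "register_globals is not enabled in $ini"
fi
```

The option can also be set outside php.ini (for example, in the web server configuration), so a clean result here does not prove the option is off everywhere.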
I am sure it seemed like a great idea when people were not as aware of web security issues. This option, when enabled, automatically transforms request parameters directly into PHP global parameters. Suppose you had a URL with a name parameter:
http://www.apachesecurity.net/sayhello.php?name=Ivan
The PHP code to process the request could be this simple:
<? echo "Hello $name!"; ?>
With web programming being as easy as this, it is no wonder the popularity of PHP exploded. Unfortunately, this kind of functionality led to all sorts of unwanted side effects, which people discovered after writing tons of insecure code. Look at the following code fragment, placed on the top of an administration page:
<?
if (isset($admin) == false) {
    die("This page is for the administrator only!");
}
?>
In theory, the software would set the $admin variable to true when it authenticates the user and figures out the user has administration privileges. In practice, appending ?admin=1 to the URL would cause PHP to create the $admin variable where one is absent. And it gets worse.
Advanced PHP Hardening
When every little bit of additional security counts, you can resort to modifying PHP. In this section, I present two approaches: one that uses PHP extension capabilities to change its behavior without changing the source code, and another that goes all the way and modifies the PHP source code to add an additional security layer.
In PHP, SAPI stands for Server Abstraction Application Programming Interface and is a part of PHP that connects the engine with the environment it is running in. One SAPI is used when PHP is running as an Apache module, a second when running as a CGI script, and a third when running from the command line. Of interest to us are the three input callback hooks that allow changes to be made to the way PHP handles script input data:
input_filter
Called before each script parameter is added to the list of parameters. The hook is given an opportunity to modify the value of the parameter and to accept or refuse its addition to the list.
treat_data
Called to parse and transform script parameters from their raw format into individual parameters with names and values.
default_post_reader
Called to handle a POST request that does not have a handler associated with it.
The input_filter hook is the most useful of all three. A new implementation of this hook can be added through a custom PHP extension and registered with the engine using the sapi_register_input_filter() function. The PHP 5 distribution comes with an input filter example (the file README.input_filter, also available at http://cvs.php.net/co.php/php-src/README.input_filter), which is designed to strip all HTML markup (using the strip_tags() function) from script parameters. You can use this file as a starting point for your own extension.
A similar solution can be implemented without resorting to writing native PHP extensions. Using the auto_prepend_file configuration option to prepend input sanitization code for every script that is executed will have similar results in most cases. However, only the direct, native-code approach works in the following situations:
Chapter 4: SSL and TLS
Like many other Internet protocols created before it, HTTP was designed under the assumption that data transmission would be secure. This is a perfectly valid assumption; it makes sense to put a separate communication layer in place to worry about issues such as confidentiality and data integrity. Unfortunately, a solution to secure data transmission was not offered at the same time as HTTP. It arrived years later, initially as a proprietary protocol.
By today's standards, the Internet was not a very secure place in the early days. It took us many years to put mechanisms in place for secure communication. Even today, millions of users are using insecure, plaintext communication protocols to transmit valuable, private, and confidential information.
Not taking steps to secure HTTP communication can lead to the following weaknesses:
  • Data transmission can be intercepted and recorded with relative ease.
  • For applications that require users to authenticate themselves, usernames and passwords are trivial to collect as they flow over the wire.
  • User sessions can be hijacked, and attackers can assume users' identities.
Since these are serious problems, additional security measures can be skipped only for a web site where all areas are open to the public or for one that does not contain any information worth protecting. Some cases clearly require protection:
  • When a web site needs to collect sensitive information from its users (e.g., credit card numbers), it must ensure the communication cannot be intercepted and the information hijacked.
  • The communication between internal web applications and intranets is easy to intercept since many users share common network infrastructure (for example, the local area network). Encryption (described later in the chapter) is the only way to ensure confidentiality.
  • Mission-critical web applications require a maximum level of security, making encryption a mandatory requirement.
To secure HTTP, the Secure Sockets Layer (SSL) protocol is used. This chapter begins by covering cryptography from a practical point of view. You only need to understand the basic principles. We do not need to go into mathematical details and discuss differences between algorithms for most real-life requirements. After documenting various types of encryption, this chapter will introduce SSL and describe how to use the OpenSSL libraries and the
Cryptography
Cryptography is a mathematical science used to secure storage and transmission of data. The process involves two steps: encryption transforms information into unreadable data, and decryption converts unreadable data back into a readable form. When cryptography was first used, confidentiality was achieved by keeping the transformation algorithms secret, but people figured out those algorithms. Today, algorithms are kept public and well documented, but they require a secret piece of information, a key, to hide and reveal data. Here are three terms you need to know:
Cleartext
Data in the original form; also referred to as plaintext
Cipher
The algorithm used to protect data
Ciphertext
Data in the encoded (unreadable) form
Cryptography aims to achieve four goals:
Confidentiality
Protect data from falling into the wrong hands
Authentication
Confirm identities of parties involved in communication
Integrity
Allow recipient to verify information was not modified while in transit
Nonrepudiation
Prevent sender from claiming information was never sent
The point of cryptography is to make it easy to hide (encrypt) information yet make it difficult and time consuming for anyone without the decryption key to decrypt encrypted information.
No one technique or algorithm can be used to achieve all the goals listed above. Instead, several concepts and techniques have to be combined to achieve the full effect. There are four important concepts to cover:
  • Symmetric encryption
  • Asymmetric encryption
  • One-way encryption
  • Digital certificates
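Of the four concepts, one-way encryption (hashing) is the easiest to try from the command line. The sketch below assumes the openssl binary is on your path and uses SHA-256 purely as an illustrative choice. The function name digest_of and the sample messages are made up for this example.

```shell
# Print only the hex digest of whatever arrives on standard input.
# The last whitespace-separated field is used because the label that
# openssl prints before the digest differs between versions.
digest_of() {
    openssl dgst -sha256 | awk '{ print $NF }'
}

original=$(printf 'transfer 100 to alice' | digest_of)
tampered=$(printf 'transfer 900 to alice' | digest_of)

# A one-character change produces a completely different digest,
# which is how a recipient can detect modification in transit.
if [ "$original" != "$tampered" ]; then
    echo "digests differ: modification detected"
fi
```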
Do not be intimidated by the large number of encryption methods in use. Mathematicians are always looking for better and faster methods, making the number constantly grow. You certainly do not need to be aware of the inner details of these algorithms to use them. You do, however, have to be aware of legal issues that accompany them:
Additional content appearing in this section has been removed.
SSL
Around 1995, Netscape Navigator was dominating the browser market with around a 70 percent share. When Netscape created SSL in 1994, it became an instant standard. Microsoft tried to compete, releasing a technology equivalent, Private Communication Technology (PCT), but it had no chance due to Internet Explorer's small market share. It was not until 1996, when Microsoft released Internet Explorer 3, that Netscape's position was challenged.
The first commercial SSL implementation to be released was SSLv2, which appeared in 1994. Version 3 followed in 1995. Netscape also released the SSLv3 reference implementation and worked with the Internet Engineering Task Force (IETF) to turn SSL into a standard. The official name of the standard is Transport Layer Security (TLS), and it is defined in RFC 2246 (http://www.ietf.org/rfc/rfc2246.txt). TLS is currently at version 1.0, but that version is practically the same as SSLv3 (on the wire, it identifies itself as version 3.1). In spite of the official standard having a different name, everyone continues to call the technology SSL, so that is what I will do, too.
SSL lives above TCP and below HTTP in the Open Systems Interconnection (OSI) model, as illustrated in Figure 4-6. Though initially implemented to secure HTTP, SSL now secures many connection-oriented protocols. Examples are SMTP, POP, IMAP, and FTP.
Figure 4-6: SSL belongs to level 6 of the OSI model
In the early days, web hosting required exclusive use of one IP address per hosted web site. But soon hosting providers started running out of IP addresses as the number of web sites grew exponentially. To allow many web sites to share the same IP address, a concept called name-based virtual hosting was devised. When it is deployed, the name of the target web site is transported in the Host request header. However, SSL still requires one exclusive IP address per web site. Looking at the OSI model, it is easy to see why. The HTTP request is wrapped inside the encrypted channel, which can be decrypted with the correct server key. But without looking into the request, the web server cannot access the Host
Additional content appearing in this section has been removed.
OpenSSL
OpenSSL is an open source implementation (toolkit) of many cryptographic protocols. Almost all open source and many commercial packages rely on it for their cryptographic needs. OpenSSL is licensed under a BSD-like license, which allows commercial exploitation of the source code. If you are running a Unix system, you probably have OpenSSL installed already. If you do not, download the latest version from the web site (http://www.openssl.org). The installation is easy:
$ ./config
$ make
# make install
         
Do not download and install a new copy of OpenSSL if one is already installed on your system. You will find that other applications rely on the pre-installed version of OpenSSL. Adding another version on top will only lead to confusion and possible incompatibilities.
OpenSSL is a set of libraries, but it also includes a tool, openssl, which makes most of the functionality available from the command line. To avoid clutter, only one binary is used as a façade for many commands supported by OpenSSL. The first parameter to the binary is the name of the command to be executed.
The standard port for HTTP communication over SSL is port 443. To connect to a remote web server using SSL, type something like the following (this example connects to Thawte's web site):
$ openssl s_client -host www.thawte.com -port 443
         
As soon as the connection with the server is established, the command window is filled with a lot of information about the connection. Some of the information displayed on the screen is quite useful. Near the top is information about the certificate chain, as shown below. A certificate chain is a collection of certificates that make a path from the first point of contact (the web site www.thawte.com, in the example above) to a trusted root certificate. In this case, the chain references two certificates, as shown in the following output. For each certificate, the first line shows the information about the certificate itself, and the second line shows information about the certificate it was signed with. Certificate information is displayed in condensed format: the forward slash is a separator, and the uppercase letters stand for certificate fields (e.g.,
Additional content appearing in this section has been removed.
Apache and SSL
If you are using Apache from the 2.x branch, the support for SSL is included with the distribution. For Apache 1, it is a separate download of one of two implementations. You can use mod_ssl (http://www.modssl.org) or Apache-SSL (http://www.apache-ssl.org). Neither of these two web sites discusses why you would choose one instead of the other. Historically, mod_ssl was created out of Apache-SSL, but that was a long time ago and the two implementations have little in common (in terms of source code) now. The mod_ssl implementation made it into Apache 2 and is more widely used, so it makes sense to make it our choice here.
Neither of these implementations is a simple Apache module. The Apache 1 programming interface does not provide enough functionality to support SSL, so mod_ssl and Apache-SSL rely on modifying the Apache source code during installation.
To add SSL to Apache 1, download and unpack the mod_ssl distribution into the same top folder where the existing Apache source code resides. In my case, this is /usr/local/src. I will assume you are using Apache Version 1.3.31 and mod_ssl Version 2.8.19-1.3.31:
$ cd /usr/local/src
$ wget -q http://www.modssl.org/source/mod_ssl-2.8.19-1.3.31.tar.gz
$ tar zxvf mod_ssl-2.8.19-1.3.31.tar.gz
$ cd mod_ssl-2.8.19-1.3.31
$ ./configure --with-apache=../apache_1.3.31
            
Return to the Apache source directory (cd ../apache_1.3.31) and configure Apache, adding a --enable-module=ssl switch to the configure command. Proceed to compile and install Apache as usual:
$ ./configure --prefix=/usr/local/apache --enable-module=ssl
$ make
# make install
            
Adding SSL to Apache 2 is easier as you only need to add a --enable-ssl switch to the configure line. Again, recompile and reinstall. I advise you to look at the configuration generated by the installation (in httpd.conf for Apache 1 or ssl.conf for Apache 2) and familiarize yourself with the added configuration options. I will cover these options in the following sections.
Once SSL is enabled, the server will not start unless a private key and a certificate are properly configured. Private keys are commonly protected with passwords (also known as
Additional content appearing in this section has been removed.
Setting Up a Certificate Authority
If you want to become a CA, everything you need is included in the OpenSSL toolkit. This step is only feasible in a few high-end cases in which security is critical and you need to be in full control of the process. The utilities provided with OpenSSL will perform the required cryptographic computations and automatically track issued certificates using a simple, file-based database. To be honest, the process can be cryptic (no pun intended) and frustrating at times, but that is because experts tend to make applications for use by other experts. Besides, polishing applications is not nearly as challenging as inventing something new. Efforts are under way to provide more user-friendly and complete solutions. Two popular projects are:
OpenCA (http://www.openca.org/openca/)
Aims to be a robust out-of-the-box CA solution
TinyCA (http://tinyca.sm-zone.net)
Aims to serve only as an OpenSSL frontend
The most important part of CA operation is making sure the CA's private key remains private. If you are serious about your certificates, keep the CA files on a computer that is not connected to any network. You can use any old computer for this purpose. Remember to back up the files regularly.
After choosing a machine to run the CA operations on, remove the existing OpenSSL installation. Unlike what I suggested for web servers, for CA operation it is better to download the latest version of the OpenSSL toolkit from the main distribution site. The installation process is simple. You do not want the toolkit to integrate into the operating system (you may need to move it around later), so specify a new location for it. The following will configure, compile, and install the toolkit to /opt/openssl:
$ ./config --prefix=/opt/openssl
$ make
$ make test
# make install
         
Included with the OpenSSL distribution is a convenience tool CA.pl (called CA.sh or CA in some distributions), which simplifies CA operations. The CA.pl tool was designed to perform a set of common operations with little variation as an alternative to knowing the OpenSSL commands by heart. This is particularly evident with the usage of default filenames, designed to be able to transition seamlessly from one step (e.g., generate a CSR) to another (e.g., sign the CSR).
Additional content appearing in this section has been removed.
Performance Considerations
SSL has a reputation for being slow. This reputation originated in its early days when it was slow compared to the processing power of computers. Things have improved. Unless you are in charge of a very large web installation, I doubt you will experience performance problems with SSL.
Since OpenSSL comes with a benchmark script, we do not have to guess how fast the cryptographic functions SSL requires are. The script will run a series of computing-intensive tests and display the results. Execute the script via the following:
$ openssl speed
            
The following results were obtained from running the script on a machine with two 2.8 GHz Pentium 4 Xeon processors. The benchmark uses only one processor for its measurements. In real-life situations, both processors will be used; therefore, the processing capacity of a dual-processor server will be roughly twice as large.
The following are the benchmark results of one-way and symmetrical algorithms:
type          16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md2            1841.78k    3965.80k    5464.83k    5947.39k    6223.19k
md4           17326.58k   55490.11k  138188.97k  211403.09k  263528.45k
md5           12795.17k   41788.59k  117776.81k  234883.07k  332759.04k
hmac(md5)      8847.31k   32256.23k  101450.50k  217330.69k  320913.41k
sha1           9529.72k   29872.66k   75258.54k  117943.64k  141710.68k
rmd160        10551.10k   31148.82k   62616.23k  116250.38k  101944.89k
rc4           90858.18k  102016.45k  104585.22k  105199.27k  105250.82k
des cbc       45279.25k   47156.76k   47537.41k   47827.29k   47950.51k
des ede3      17932.17k   18639.27k   18866.43k   18930.35k   18945.37k
rc2 cbc       11813.34k   12087.81k   12000.34k   12156.25k   12113.24k
blowfish cbc  80290.79k   83618.41k   84170.92k   84815.87k   84093.61k
cast cbc      30767.63k   32477.40k   32840.53k   32925.35k   32863.57k
aes-128 cbc   51152.56k   52996.52k   54039.55k   54286.68k   53947.05k
aes-192 cbc   45540.74k   46613.01k   47561.56k   47818.41k   47396.18k
aes-256 cbc   40427.22k   41204.46k   42097.83k   42277.21k   42125.99k
Looking at the first column of results for RC4 (a widely used algorithm today), you can see that it offers a processing speed of about 90 MBps, and that is using only one processor. This is so fast that it is unlikely to create a processing bottleneck.
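The figures in the table are reported in thousands of bytes processed per second, which is what the trailing "k" stands for. As a small sketch of how to read them, the helper below (the name speed_to_mbps is hypothetical) converts one table entry into megabytes per second:

```shell
# Convert an `openssl speed` figure such as "90858.18k" (thousands of
# bytes per second) into megabytes per second.
speed_to_mbps() {
    awk -v v="${1%k}" 'BEGIN { printf "%.1f\n", v / 1000 }'
}

speed_to_mbps 90858.18k    # rc4, 16-byte blocks
speed_to_mbps 51152.56k    # aes-128 cbc, 16-byte blocks
```

The same conversion applied to the other rows makes it easy to compare ciphers against the bandwidth your server actually has to fill.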
Additional content appearing in this section has been removed.
Chapter 5: Denial of Service Attacks
A denial of service (DoS) attack is an attempt to prevent legitimate users from using a service. This is usually done by consuming all of a resource used to provide the service. The resource targeted is typically one of the following:
  • CPU
  • Operating memory (RAM)
  • Bandwidth
  • Disk space
Sometimes, a less obvious resource is targeted. Many applications have fixed-length internal structures, and if an attacker can find a way to populate all of them quickly, the application can become unresponsive. A good example is the maximum number of Apache processes that can exist at any one time. Once the maximum is reached, new clients will be queued and not served.
DoS attacks are not unique to the digital world. They existed many years before anything digital was created. For example, someone sticking a piece of chewing gum into the coin slot of a vending machine prevents thirsty people from using the machine to fetch a refreshing drink.
In the digital world, DoS attacks can be acts of vandalism, too. They are performed for fun, pleasure, or even financial gain. In general, DoS attacks are a tough problem to solve because the Internet was designed on a principle that everyone plays by the rules.
You can become a victim of a DoS attack for various reasons:
Bad luck
In the worst case, you may be at the wrong place at the wrong time. Someone may think your web site is a good choice for an attack, or it may simply be the first web site that comes to mind. He may decide he does not like you personally and choose to make your life more troubled. (This is what happened to Steve Gibson, of http://www.grc.com fame, when a 13-year-old felt offended by the "script kiddies" term he used.)
Controversial content
Some may choose to attack you because they do not agree with the content you are providing. Many people believe disrupting your operation is acceptable in a fight for their cause. Controversial subjects such as the right to choose, globalization, and politics are likely to attract their attention and likely to cause them to act.
Unfair competition
In a fiercely competitive market, you may end up against competitors who will do anything to win. They may constantly do small things that slow you down or go as far as to pay someone to attack your resources.
Additional content appearing in this section has been removed.
Network Attacks
Network attacks are the most popular type of attack because they are easy to execute (automated tools are available) and difficult to defend against. Since these attacks are not specific to Apache, they fall outside the scope of this book and thus they are not covered in detail in the following sections. As a rule of thumb, only your upstream provider can defend you from attacks performed on the network level. At the very least you will want your provider to cut off the attacks at their routers so you do not have to pay for the bandwidth consumed by the attacks.
The simplest network attacks target weaknesses in implementations of the TCP/IP protocol. Some implementations are not good at handling error conditions and cause systems to crash or freeze. Some examples of this type of attack are:
  • Sending very large Internet Control Message Protocol (ICMP) packets. This type of attack, known as the Ping of death, caused crashes on some older Windows systems.
  • Setting invalid flags on TCP/IP packets.
  • Setting the destination and the source IP addresses of a TCP packet to the address of the attack target (Land attack).
These types of attacks have only historical significance, since most TCP/IP implementations are no longer vulnerable.
In the simplest form, an effective network attack can be performed from a single host with a fast Internet connection against a host with a slower Internet connection. By sending large numbers of packets, the attacker uses brute force to flood the target and disrupt its operations. The concept is illustrated in Figure 5-1.
Figure 5-1: Brute-force DoS attack
At the same time, this type of attack is the easiest to defend against. All you need to do is examine the incoming traffic (e.g., using a packet sniffer like tcpdump), discover the IP address from which the traffic is coming, and instruct your upstream provider to block the address at their router.
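One way to find the offending address is to count packets per source. The following is only a sketch: it assumes IPv4 traffic and the one-line-per-packet format that tcpdump prints with the -n switch. The function name top_sources and the interface name eth0 are placeholders for this example.

```shell
# Read tcpdump -n output on standard input and print the busiest IPv4
# source addresses, most active first. The third field holds the source
# as "a.b.c.d.port" (or just "a.b.c.d" for ICMP), so only the first four
# dot-separated components are kept.
top_sources() {
    awk '$2 == "IP" { split($3, a, "."); print a[1] "." a[2] "." a[3] "." a[4] }' |
        sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Usage (as root), sampling 10,000 packets:
#   tcpdump -n -c 10000 -i eth0 2>/dev/null | top_sources 5
```

Each output line shows a packet count followed by a source address. Keep in mind that source addresses can be forged, so the result is a starting point for a conversation with your provider rather than proof.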
At first glance, you may want to block the attacker's IP address on your own firewall but that will not help. The purpose of this type of attack is to saturate the Internet connection. By the time a packet reaches your router (or server), it has done its job.
Additional content appearing in this section has been removed.
Self-Inflicted Attacks
Administrators often have only themselves to blame for service failure. Leaving a service configured with default installation parameters is asking for trouble. Such systems are very susceptible to DoS attacks and a simple traffic spike can imbalance them.
One thing to watch for with Apache is memory usage. Assuming Apache is running in prefork mode, each request is handled by a separate process. To serve one hundred requests at one time, a hundred processes are needed. The maximum number of processes Apache can create is controlled with the MaxClients directive, which is set to 256 by default. This default value is often used in production and that can cause problems if the server cannot cope with that many processes.
Figuring out the maximum number of Apache processes a server can accommodate is surprisingly difficult. On a Unix system, you cannot obtain precise figures on memory utilization. The best thing we can do is to use the information we have, make assumptions, and then simulate traffic to correct memory utilization issues.
Looking at the output of the ps command, we can see how much memory a single process takes (look at the RSZ column as it shows the amount of physical memory in use by a process):
# ps -A -o pid,vsz,rsz,command
  PID   VSZ  RSZ COMMAND
 3587  9580 3184 /usr/local/apache/bin/httpd
 3588  9580 3188 /usr/local/apache/bin/httpd
 3589  9580 3188 /usr/local/apache/bin/httpd
 3590  9580 3188 /usr/local/apache/bin/httpd
 3591  9580 3188 /usr/local/apache/bin/httpd
 3592  9580 3188 /usr/local/apache/bin/httpd
In this example, each Apache instance takes 3.2 MB. Assuming the default Apache configuration is in place, serving 256 requests in parallel at peak capacity takes roughly 800 MB (256 x 3.2 MB), so this server needs close to 1 GB of RAM, and that is before counting any additional memory required by CGI scripts and dynamic pages.
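The arithmetic behind this estimate can be sketched with two small helpers. Both function names are hypothetical, and the 512 MB budget is an arbitrary figure picked for illustration. Because RSZ counts pages shared between processes more than once, treat the results as rough upper bounds rather than precise requirements.

```shell
# RAM (in MB) needed to run a given number of processes, each with a
# resident size of rsz_kb kilobytes (the RSZ column of ps).
ram_needed_mb() {
    awk -v procs="$1" -v rsz_kb="$2" 'BEGIN { printf "%.0f\n", procs * rsz_kb / 1024 }'
}

# Largest number of processes that fit into a memory budget given in MB.
max_clients_for() {
    awk -v budget_mb="$1" -v rsz_kb="$2" 'BEGIN { printf "%d\n", int(budget_mb * 1024 / rsz_kb) }'
}

ram_needed_mb 256 3188      # default MaxClients with the RSZ shown above
max_clients_for 512 3188    # hypothetical budget of 512 MB for Apache
```

The second figure is the kind of value you might start with for MaxClients, leaving headroom for the operating system and any other services on the machine.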
Most web servers do not operate at the edge of their capacity. Your initial goal is to limit the number of processes to prevent server crashes. If you set the maximum number of processes to a value that does not make full use of the available memory, you can always change it later when the need for more processes appears.
Additional content appearing in this section has been removed.
Traffic Spikes
A sudden spike in the web server traffic can have the same effect as a DoS attack. A well-configured server will cope with the demand, possibly slowing down a little or refusing some clients. If the server is not configured properly, it may crash.
Traffic spikes occur for many reasons, and some of them may be normal. A significant event will cause people to log on and search for more information on the subject. If a site often takes a beating in spite of being properly configured, perhaps it is time to upgrade the server or the Internet connection.
The following sections describe the causes and potential solutions for traffic spikes.
If you have processing power to spare but not enough bandwidth, you might exchange one for the other, making it possible to better handle traffic spikes. Most modern browsers support content compression automatically: pages are compressed before they leave the server and decompressed after they arrive at the client. The server will know the client supports compression when it receives a request header such as this one:
Accept-Encoding: gzip,deflate
Content compression makes sense when you want to save bandwidth and when the clients have slow Internet connections. A 40-KB page may take eight seconds to download over a modem. If it takes the server a fraction of a second to compress the page to 15 KB (good compression ratios are common with HTML pages), the 25-KB length difference will result in a five-second acceleration. On the other hand, if your clients have fast connection speeds (e.g., on local networks), there will be no significant download time reduction.
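The modem figures imply a line speed of about 5 KB per second (40 KB in eight seconds). The sketch below repeats that calculation and adds a helper for measuring how well one of your own pages compresses. Both function names are made up for this example, and gzip is used only as a stand-in for whatever compression the server applies.

```shell
# Seconds needed to transfer size_kb kilobytes at rate_kb_per_s.
transfer_seconds() {
    awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", size / rate }'
}

transfer_seconds 40 5    # uncompressed page
transfer_seconds 15 5    # same page after compression

# Report "<original bytes> <gzip bytes>" for a file, to check what
# ratio your own pages actually achieve.
gzip_sizes() {
    echo $(( $(wc -c < "$1") )) $(( $(gzip -c "$1" | wc -c) ))
}
```

Running gzip_sizes against a few representative HTML files gives a quick idea of whether the saving is worth the extra processing on your server.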
For Apache 1, mod_gzip (http://www.schroepl.net/projekte/mod_gzip/) is used for content compression. For Apache 2, mod_deflate does the same and is distributed with the server. However, compression does not have to be implemented on the web server level. It can work just as well in the application server (e.g., PHP; see http://www.php.net/zlib) or in the application.
Bandwidth stealing (also known as hotlinking) is a common problem on the Internet. It refers to the practice of rogue sites linking directly to files (often images) residing on other sites (victims). To users, it looks like the files are being provided by the rogue site, while the owner of the victim site is paying for the bandwidth.
Additional content appearing in this section has been removed.
Attacks on Apache