
Securing Ajax Applications: Ensuring the Safety of the Dynamic Web

By Christopher Wells


Table of Contents

Chapter 1: The Evolving Web
People are flocking to the Web more than ever before, and this growth is being driven by applications that employ the ideas of sharing and collaboration. Web sites such as Google Maps, MySpace, Yahoo!, Digg, and others are introducing users to new social and interactive features, to seeding communities, and to collecting and reusing all sorts of precious data.
The slate has been wiped clean and the stage set for a new breed of web application. Everything old is new again. Relationships fuel this new Web. And service providers, such as Yahoo!, Google, and Microsoft, are all rushing to expose their wares. It's like a carnival! Everything is open. Everything is free—at least for now. But whom can you trust?
Though mesmerized by the possibilities, as developers, we must remain vigilant for the sake of our users. For us, it is critical to recognize that the fundamentals of web programming have not changed. What has changed is this notion of "opening" resources and data so that others might use that data in new and creative ways. Furthermore, with all this sharing going on, we can't let ourselves forget that our applications must still defend themselves.
As technology moves forward, and we find our applications becoming more interactive—sharing data between themselves and other sites—it raises a host of new security concerns. Our applications might consist of services provided by multiple providers (sites) each hosting its own piece of the application.
The surface area of these applications grows too. There are more points to watch and guard, and they multiply with technologies such as Ajax on the client and REST or Web Services on the server.
Luckily, we are not left completely empty-handed. Web security is not new. There are some effective techniques and best practices that we can apply to these new applications.
Today, web programming languages make it easy to build applications without having to worry about the underlying plumbing. The details of connection and protocol have been abstracted away. As a result, developers have grown complacent about their environments and in some cases are even more vulnerable to attack.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Rise of the Web
In 1989, at a Conseil Européen pour la Recherche Nucléaire (CERN) research facility in Switzerland, a researcher by the name of Tim Berners-Lee and his team cooked up a program and protocol to facilitate the sharing and communication of their particle physics research. The idea of this new program was to be able to "link" different types of research documents together.
What Berners-Lee and the others created was the start of a new protocol, Hypertext Transfer Protocol (HTTP), and a new markup language, Hypertext Markup Language (HTML). Together they make up the World Wide Web (WWW).
The abstract of the original Request for Comments (RFC 1945) reads:
The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing of data representation, allowing systems to be built independently of the data being transferred.
HTTP has been in use by the World-Wide Web global information initiative since 1990. This specification reflects common usage of the protocol referred to as "HTTP/1.0".
The official RFC outlines everything there is to say about HTTP and is located at . If you have any trouble sleeping at night, reading this might help you out.
Berners-Lee had set out to create a way to collate his research documents—to keep things just one click away. It was really just about information and data organization; little did he know he was creating the foundation for today's commerce.
Today, we don't even see HTTP unless we deliberately go looking for it. It has, for the most part, been abstracted away from us. Yet it is at the very heart of our applications.
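To make the protocol visible again, here is a minimal sketch of what an HTTP/1.0 exchange looks like on the wire. The helper names `build_request` and `parse_status_line` are hypothetical, and the sketch assumes a well-formed status line with a reason phrase; it only builds and parses text and does not open a network connection.

```python
def build_request(host, path="/"):
    """Build the raw bytes of a simple HTTP/1.0 GET request.

    The Host header is optional in HTTP/1.0 but is commonly sent anyway.
    """
    return "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host).encode("ascii")


def parse_status_line(response):
    """Split the first line of a raw response into (version, code, reason).

    Assumes a well-formed status line such as 'HTTP/1.0 200 OK'.
    """
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason
```

Everything a browser or framework does on our behalf eventually reduces to plain text like this, which is why an attacker can craft it by hand just as easily.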
There's this guy—let's call him Jim. He's an old-timer who can spin yarns about the first time he ever sat down at a PDP-11. He still has his first programs saved on paper tape and punch cards. He's one of the first developers who helped to create the Internet that we have come to know and love.
Additional content appearing in this section has been removed.
Chapter 2: Web Security
Chapter 1 describes where the Web came from and how it works. It is important to remember that the modern Web is built on a series of software abstractions and that we still need to know the basic protocol and infrastructure to build reliable and secure applications.
This chapter takes a closer look at how security works and how it applies to web applications. If your application is on the Internet, it is on the front lines of your network. It is like a door to the outside world that allows visitors to come in and check out whatever you have to offer. Your application needs to be secure, and you need to be aware of the dangers an application can open to your network.
Additional content appearing in this section has been removed.
Security Basics
Imagine a security guard walking through the dimly lit corridors of an office building late at night. As she enters each room, she shines her flashlight into every corner, scans for anything out of the ordinary, and then turns out the light and locks the door behind her. She follows this routine nightly and ensures that the office is safe and secure.
Well, web applications don't have security guards to protect them, by default. There is no enforcer to beat the living bytes out of would-be attackers.
So what can we do? Well, the first thing developers can do is recognize that we need to build security into our applications. We need to step up and do something about it ourselves. The next thing we must do is ascertain what we are actually protecting. Where does our application begin and end? What is its surface area? If our application is like most web applications, it is composed of three basic elements that I will describe next.

Section 2.1.1.1: Expect the unexpected

Boo! Attackers try to break things. They use applications in unexpected ways to generate faults and other conditions that could benefit them. Security concerns almost always arise from a condition that nobody expected.
Remember that night security guard. She's patrolling through the building looking for things out of the ordinary. She knows that if something is out of place, someone or something caused that condition. Of course, a smart attacker just waits until the guard has checked all the rooms before attacking.

Section 2.1.1.2: Subjects

Subjects use the application. The most common subjects are usually regular users (people), but subjects could also be other programs calling via Web Services or some other external API. Either way, subjects are always external entities that call the system.
Let's say we have a web site that sells widgets. It implements a typical shopping cart and a web service.
In the case of our application we have two different types of subjects:
Customers
People who come to the site to buy our products using the shopping cart
Partners
Programs that use the web service to manage products as part of a larger federated application
Additional content appearing in this section has been removed.
Risk Analysis
What if something goes wrong? We need a plan. We need to know what to do if we are attacked. We need to know how we can be attacked and the likelihood of an attack. A good process for answering these questions is to develop a threat model for the application.
How do we evaluate the security of an application? Well, first we have to identify what a web application is.
Web applications potentially connect users anywhere on the planet to your database. On one end these applications face the Internet and process incoming HTTP requests and responses. On the other end they connect to all of the goodies: files, system resources, and data. Because these applications provide access to backend resources they need to be looked at more critically.

Section 2.2.1.1: Entry points

Entry points are locations in the application where data can enter the system. Data entering the system needs validation. If the data is not validated or inspected before use, it should be considered tainted.
Applications rely on valid data to execute correctly. If tainted data enters the system the application could inadvertently display that data to the user. Likewise, the system could halt or throw an exception thereby revealing information about the application. Attackers look for these types of conditions and exploit them.
Data can enter the application from all sorts of places:
  • User input
  • Files
  • Sockets
  • System properties
  • Named pipes
  • Programmatic interface
  • Registry
  • Email
  • Command-line arguments
  • Initialization parameters
  • Environment variables
  • Database
It is important to look at each of these entry points and determine the types of data entering and how the data is used in the application.
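As a rough illustration of validating data at an entry point, the sketch below checks a single user-input field before the rest of the application sees it. The field (an order quantity for the widget shop), the pattern, and the function name `parse_quantity` are assumptions made for this example and do not come from the book.

```python
import re

# Hypothetical whitelist for a quantity field: one to three ASCII digits.
QUANTITY_PATTERN = re.compile(r"[0-9]{1,3}")


def parse_quantity(raw):
    """Return the quantity as an int, or raise ValueError if the input is tainted."""
    # Reject anything that is not a string made up entirely of 1-3 digits.
    if not isinstance(raw, str) or not QUANTITY_PATTERN.fullmatch(raw):
        raise ValueError("quantity must be 1 to 3 digits")
    value = int(raw)
    # A syntactically valid value can still be out of range for the business rule.
    if value < 1:
        raise ValueError("quantity must be at least 1")
    return value
```

The design choice here is to accept only what is known to be good rather than trying to list everything that is bad; the same idea applies to files, environment variables, and every other entry point in the list.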

Section 2.2.1.2: Trust level

Trust level is the assigned trust you give an external entity by way of a role to access a particular entry point. For example, an Administrator role is a privileged role with a high trust level that is assigned more permissions than an ordinary user.
Users of an application should be assigned roles that determine whether they can do a particular operation. By segregating the operations of the application into different roles, you make it harder for one user to possess too much control over the system.
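One way to picture roles and trust levels is a table that maps each role to the operations it may perform. The following is only a sketch: the role names, the operation names, and the `is_allowed` helper are hypothetical and chosen to match the widget-shop example.

```python
# Hypothetical roles and operations; the names are illustrative only.
ROLE_PERMISSIONS = {
    "customer": {"view_catalog", "place_order"},
    "partner": {"view_catalog", "update_product"},
    "administrator": {"view_catalog", "place_order", "update_product", "manage_users"},
}


def is_allowed(role, operation):
    """Return True only if the role is known and explicitly grants the operation."""
    # Unknown roles get an empty permission set, so the default answer is "no".
    return operation in ROLE_PERMISSIONS.get(role, set())
```

Because the lookup denies by default, a new operation is unavailable to every role until someone deliberately grants it.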
Additional content appearing in this section has been removed.
Common Web Application Vulnerabilities
Sometimes the easiest way to find vulnerabilities is to look at what has happened in the past. By examining common vulnerabilities that have appeared in other applications, we can learn from previous mistakes.
The Open Web Application Security Project (OWASP) is an open community dedicated to enabling organizations to develop, purchase, and maintain applications that can be trusted.
OWASP has tools, documents, forums, and local chapters all dedicated to the advancement of web application security. All the resources are free and open to anyone interested in improving application security.
OWASP advocates approaching application security as a people, process, and technology problem because the most effective approaches to application security include improvements in all these areas.
If you have not been there, check out the OWASP web site at .

Section 2.3.1.1: OWASP top 10

OWASP compiled a list of the top 10 vulnerabilities that plague web applications. This list is quickly becoming the de facto list of application vulnerabilities in security circles, and so here it is:
Unvalidated input
Information from web requests is not validated before being used by a web application. Attackers can use these flaws to attack backend components through a web application.
Broken access control
Restrictions on what authenticated users are allowed to do are not properly enforced. Attackers can exploit these flaws to access other users' accounts, view sensitive files, or use unauthorized functions.
Broken authentication and session management
Account credentials and session tokens are not properly protected. Attackers that can compromise passwords, keys, session cookies, or other tokens can defeat authentication restrictions and assume other users' identities.
Cross-site scripting
The web application can be used as a mechanism to transport an attack to an end user's browser. A successful attack can disclose the end user's session token, attack the local machine, or spoof content to fool the user.
Buffer overflow
Web application components in some languages that do not properly validate input can be crashed and, in some cases, used to take control of a process. These components can include CGI, libraries, drivers, and web application server components.
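To make the cross-site scripting entry above concrete, here is a minimal sketch of one common defense: escaping user-supplied text before it is placed into a page. It uses Python's standard `html.escape`; the `render_comment` function and the idea of a comment field are assumptions for illustration.

```python
import html


def render_comment(comment):
    """Wrap user-supplied text in a paragraph, escaping HTML metacharacters first."""
    # quote=True also escapes double and single quotes, which matters
    # if the value ever ends up inside an HTML attribute.
    return "<p>" + html.escape(comment, quote=True) + "</p>"
```

With this in place, a submitted `<script>` tag arrives in the browser as inert text instead of executable markup. Escaping has to match the output context, so this sketch covers HTML body text only.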
Additional content appearing in this section has been removed.
Chapter 3: Securing Web Technologies
I'm not going to lie to you. Security is hard. Securing all these different web technologies is hard. Making sure the right people are using the correct functions is hard. Making sure you've got the right people—in the first place—is hard. Validating input, protecting confidential data, stopping the system from breaking in insecure ways are all hard. In fact, everything about this is hard—sorry about that.
Developers, especially Ajax-wielding, neo-energy-drink-guzzling Web 2.0 developers, don't like hard things. So, we have a problem here. What's worse is that ignoring security makes innovation easier. This web stuff works even when it's not secure.
Developers often don't think about how their code is going to break. They don't think about how the network might break thereby causing the application to break. They don't think about how to craft input in a manner that will cause the system to break or do something unexpected—hackers do.
This is why I drink coffee. But seriously, anything at all that you do to secure your applications is better than doing nothing (defense in depth, you know). Remember, it's not easy, but we're all in this together, and I'm pulling for you.
In this chapter, I show how web sites communicate, and then explain the variety of technologies commonly used in web applications and their security impact. Let's start by taking a look at how web sites communicate.
Additional content appearing in this section has been removed.
How Web Sites Communicate
The Web is an incredibly versatile platform for communication. Many interactions can take place before a web page is rendered. Clients can talk to servers, as in the case of someone with a web browser surfing the Internet. Servers can talk to other servers, such as when a web server dynamically polls or reuses content from one web site and displays it in another. And domains can talk to other domains, passing data between one another, or actively participate in the user's session as part of a larger more federated application. Cool! To see the security issues relating to each of these communications we need to look more closely at each type of interaction.
This is the Web as its creators intended. A browser asks for a file, the server responds with a file. On the server, you can pile up all your research documents—notes, sketches, white papers, references, and so on—link them all together, and share them with the team! Let's not go crazy here. We're just talking about sharing files, just a little bit of light reading for the team.
And that is where things would have stayed if it were not for those kids and their meddling browser (Netscape) and the hopes of e-commerce.
The static Web was mostly fine from a security standpoint. I mean, the greatest harm that might come from a static web page is probably its content. But, the minute people started carrying on conversations using the Web things began to break down.
By conversations I mean that both the client and the server are supposed to remember the last transaction and potentially build on it. As we discussed earlier, the server has absolutely no way of reliably knowing what is happening on the client, and each transaction is stateless, so remembering prior transactions is tricky business.
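Because each request is stateless, a server that wants to carry on a conversation has to hand the client some token and recognize it when it comes back. The sketch below shows one possible approach, signing a session identifier with an HMAC so that tampering on the client can be detected. The secret, the token format, and the function names are assumptions for illustration, not the book's method.

```python
import hashlib
import hmac

# Assumption: a server-side secret. A real deployment would load a random
# value from protected configuration rather than hard-coding it.
SECRET_KEY = b"replace-with-a-random-secret"


def _signature(session_id):
    """Compute a hex HMAC-SHA256 over the session id."""
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_session(session_id):
    """Return 'session_id.signature', suitable for sending to the client."""
    return session_id + "." + _signature(session_id)


def verify_session(token):
    """Return the session id if the signature checks out, otherwise None."""
    session_id, separator, received = token.rpartition(".")
    if not separator:
        return None
    expected = _signature(session_id)
    # compare_digest avoids leaking how many leading characters matched.
    if hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8")):
        return session_id
    return None
```

The signature does not hide the session identifier; it only lets the server notice when the value returned by the client is not one it issued.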
Maybe you have an old mainframe application you want to put a new face on. Maybe two departments within a company want to share data and create a combined web site. The idea of reusable content, taking data found on one application and using it in another, isn't new. Back in the old days developers had to code uphill both ways by resorting to barbaric methods such as screen scraping to perform this sort of reuse.
Additional content appearing in this section has been removed.
Browser Security
Do we care about browser security? I mean it's the client, the user's browser. Unless the user is you, you probably don't have a lot of control over this environment in the first place. So, who cares, right?
A couple of years ago I might have agreed. But with new web technologies and techniques such as Ajax and Flash pushing more responsibility onto the client, the browser can no longer be totally ignored.
The design contract between the user and a web page is changing. How do users know when the page is loaded if the browser's "loading" icon doesn't stop spinning? Rather than a simple request-response model, the page now can make micro requests, moving some session state to the browser. The browser is now a first-class citizen in the application's data flow, and we have to start thinking about it differently.
Each page now plays a major role in the application, and in some ways the page is the application. Therefore, we need to care more about what technologies are running out on the browser and how best to help secure that environment. Developers are forced to think more about what is happening on the client and react accordingly.
At some point it becomes important to care about the security of the browser. After all, your users are using browsers, and if your application is running code in the browser, it should be secure. You may not be able to control everything out there, but if you do even a little to help educate your users, the Internet can be a safer place.
Some common security questions to ask while developing applications that involve or rely on the client include:
  • Is the client authenticated?
  • Is the channel with the client secure?
  • Is the client sending us data?
  • How is that data validated?
  • Does the browser have any data persisted locally?
  • Is that data confidential?
  • Does the user have a session?
To answer these questions and evaluate all the different web technologies together, we need a system for commonly identifying risk.
I like the STRIDE model, which Microsoft created to categorize the different types of threat to an application. STRIDE stands for Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, and Elevation of privilege.
Additional content appearing in this section has been removed.
Browser Plug-ins, Extensions, and Add-ons
A plug-in is a component that extends the functionality of a host program. In the case of a web browser, plug-ins are available to add programmatic function, ActiveX controls, Java applets, Flash movies, and much more. Let's take a look at some of the more common of these technologies and the security issues that accompany them.
ActiveX controls are downloadable web components that run inside the Microsoft Internet Explorer web browser. ActiveX controls can be written in a variety of programming languages, including C, C++, C#, Visual Basic, and Java (J#), but are limited to the Windows operating system and Microsoft Internet Explorer.
In the 1990s, Microsoft had been working on Object Linking and Embedding (OLE) but OLE just didn't sound sexy enough, so Microsoft renamed the technology ActiveX.
Back then, Microsoft thought this new, active technology was sure to win over web developers. It allowed unprecedented access into the Windows operating system and helped push the notion of component development into reality. Unfortunately Microsoft was not thinking about security. It was trying to get everything and everyone talking to each other—using Windows.
ActiveX is similar to Java applets in that it is downloaded and executed within the browser. Users have to grant the controls explicit permission to run, but once granted, ActiveX controls have a rich set of APIs to work with within the Windows operating system. ActiveX controls are native code that runs with the full set of permissions granted to the user. Although incredibly powerful, they are also incredibly dangerous.
Figure 3-5 is a "Hello World" application using an ActiveX control.
Figure 3-5: MS Agent ActiveX control "Hello World"
In this example I use ActiveX controls created by Microsoft called MS Agent. MS Agent is technology that provides API control over onscreen characters. I chose Peedy the Parrot for this example, but there are several more to choose from. Example 3-7 demonstrates the use of ActiveX.
Example 3-7. A demonstration of an ActiveX control
Additional content appearing in this section has been removed.
Chapter 4: Protecting the Server
So, you want to run a web server in your basement to create the next big thing, and you're looking for some cheap security advice on how to get started? Well, my first and best suggestion is don't do it. I'm just saying if NASA—you know, rocket scientists—can't keep hackers out of its web servers, what makes you think you can? Go find some ISP that has the services you are looking for, and pay the ISP to do it. The job of administering a web server on your own can consume every waking moment, and unless you don't ever want to leave the house, it is well worth the money to let the pros handle the frontend work.
Are you really still reading? Picture this: you find that perfect somebody. You plan a romantic evening and go out to a movie and have a nice dinner. Just when things start to get interesting, your phone trumpets out the cavalry charge ring tone, informing you of 15 unauthorized login attempts on the web server. You apologize to those around you for disrupting their dinner; your date raises an eyebrow and decides to skip dessert.
Still there, eh? I'm sorry. I know, it must sound glamorous to have your very own web server, but unless you have spent time thinking like a hacker, odds are whatever you put on the Internet will be vulnerable to attack.
Ajax applications require a web server to work. After all, what good is the XMLHttpRequest object without a web server to talk to on the backend? So, Ajax security starts with the web server. If your web server is not secure, neither is your application. You need to know what role the web server plays in security. Securing a web server is a non-trivial task that requires an understanding of the web server's relationship with the network. By being aware of what security measures are on the web server, you can balance the security necessary within your applications. In this chapter, I will look at how to ensure the network is secure, and then go through the steps for making a secure and dynamite web server. I will also address what to do in the event of an attack.
Additional content appearing in this section has been removed.
Network Security
See that funny-looking telephone-like cable coming out of your DSL/cable modem? That's the Internet. Before we can set up a web server, we must first prepare the network. You don't want to plug the web server into the Internet with a giant Hack Me sign on it, do you? We must take some precautions first.
What we really need to do is separate us from them, right? Us being—you know—us, and them being—well—the bad guys. We need a wall—make that a firewall—to keep them out.
A firewall is a device sitting between a private network and a public network. Part of what helps make a private network private is, in fact, the firewall. The firewall's job is to control traffic between computer networks with different zones of trust—for example, an internal, trusted zone, such as a private network, and an external, nontrusted zone, such as the Internet.

Section 4.1.1.1: Trust boundaries

Different trust zones meet at what are known as trust boundaries. A trust boundary is like a seam in the network and, as mentioned earlier, seams require added security attention. We need to make sure that all the gaps are filled and that the firewall allows the right kind of traffic. We do this with firewall rules. Firewall rules establish a security policy governing what traffic is allowed to flow through the firewall and in what direction.
The ultimate goal is to provide a controlled interface between the different trust zones and enforce common security policy on the traffic that flows between them based on the following security principles:
Principle of least privilege
A user should be allowed to do only what she is required to do.
Separation of duties
Define roles for users and assign different levels of access control. Control how the application is developed, tested, and deployed and who has access to application data.
Firewalls are good at making quick decisions about whether one machine should be allowed to talk to another. The easiest way for the firewall to do this is to base its decisions on source address and destination address.
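The address-based decision described above can be sketched in a few lines. This is a toy model, not a real firewall: the rule table, the addresses (taken from documentation and private ranges), and the `decide` function are all assumptions for illustration. Each rule also carries a note on its purpose, as one way of keeping a record of why it exists.

```python
import ipaddress

# Hypothetical rule set: (source network, destination network, action, purpose).
RULES = [
    ("0.0.0.0/0", "192.0.2.10/32", "allow", "public access to the web server"),
    ("10.0.0.0/8", "10.0.0.0/8", "allow", "internal traffic"),
]


def decide(source, destination):
    """Return the action of the first matching rule; deny when nothing matches."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    for rule_src, rule_dst, action, _purpose in RULES:
        if src in ipaddress.ip_network(rule_src) and dst in ipaddress.ip_network(rule_dst):
            return action
    # Default deny: traffic with no explicit rule does not cross the boundary.
    return "deny"
```

For example, `decide("203.0.113.5", "192.0.2.10")` matches the first rule, while the same outside address aimed at an internal host falls through to the default deny.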

Section 4.1.1.2: Security concerns

Hey, what's this rule for? Far too often firewalls are found with rules that nobody remembers adding. This happens because administrators fear something will break if they remove them. When firewall rules are introduced, there should be a well-defined procedure for keeping track of each rule and its purpose.
Additional content appearing in this section has been removed.
Host Security
Imagine your web server as a gladiator about to go into battle. If it's going to have any chance of survival, it must be battle ready. Basically, you want something more like Russell Crowe and less like Mel Brooks.
Additionally, the server should be hardened as though there were no firewall on the network. Firewalls are not a silver bullet: a port that must stay open, such as port 80, lets attacks through along with legitimate traffic. Servers behind firewalls can still be compromised, so each server needs to look after itself.
In the following section I am going to build a secure server using a distribution of Linux called Ubuntu Server Edition. However, most, if not all, of these concepts can be applied equally to other operating systems.
Ubuntu comes from an African word, meaning humanity to others. The Ubuntu distribution of Linux brings the spirit of Ubuntu to the software world.
Built on a branch of the Debian distribution of Linux (known for its robust server installations and glacial release cycle), the Ubuntu Server has a strong heritage of reliable performance and predictable evolution. The first Ubuntu release with a separate server edition was 5.10, in October 2005. Figure 4-2 shows the bootup screen for the Ubuntu server installation disk.
Figure 4-2: The Ubuntu installation screen
A key lesson from the Debian heritage is that of security by default. The Ubuntu Server has no open ports after installation and contains only the essential software needed to build a secure server. This makes for an ideal place to start when thinking about building a web server.

Section 4.2.1.1: Automatic LAMP

Additionally, in about 15 minutes, the time it takes to install Ubuntu Server Edition, you can have a LAMP (Linux, Apache, MySQL, and PHP) server up and ready to go.
When booting off the Ubuntu installation disk you are presented with the option to install a LAMP server. This option saves all the time and trouble associated with integrating Linux, Apache, MySQL, and PHP. Ubuntu integrates these things for you with security and ease of deployment in mind.
If you want to follow along with me, you may download and install the Ubuntu Server Edition from . There is also an excellent tutorial available online at .
Additional content appearing in this section has been removed.
Web Server Hardening
Now that we have a secure, stable, bastionized host to begin with we can look at the web server itself. First, you are going to have to decide which web server to use. Ubuntu came with Apache2—at least that is what was installed after I chose the install LAMP option—so, I am going to start there. But several web servers are available, some part of larger frameworks like application servers.
The following are some general guidelines to protecting web servers/traffic:
  • Run SSL. One of the best things you can do for security is invest in a digital certificate for your web server. In an age where Internet attacks are on the rise, it is hard to tell a secure site from an insecure one. SSL goes a long way toward solving that problem.
  • Require that all cookies going to the client are marked secure.
  • Authenticate users before initiating sessions.
  • Do server monitoring.
  • Read the logs.
  • Validate file integrity.
  • Review web application for software flaws and vulnerabilities.
  • Consider running web applications behind a web proxy server, which prevents requests from directly accessing the application. This creates a place where content filtering can be done before data reaches the application.
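The cookie guideline above can be sketched in a few lines of JavaScript. This is only an illustration, not a production cookie library: the function name buildSetCookie and its option names are hypothetical, and the sketch assumes the server lets you set the raw Set-Cookie header value yourself.

```javascript
// Hypothetical helper: builds a Set-Cookie header value that always carries
// the Secure flag. Names and defaults here are illustrative assumptions.
function buildSetCookie(name, value, options) {
  var opts = options || {};
  var parts = [name + '=' + encodeURIComponent(value)];
  parts.push('Path=' + (opts.path || '/'));
  parts.push('Secure');            // browser sends the cookie over SSL only
  if (opts.httpOnly !== false) {
    parts.push('HttpOnly');        // cookie is hidden from client-side script
  }
  return parts.join('; ');
}

console.log(buildSetCookie('sessionid', 'abc 123'));
```

Making Secure unconditional means a developer cannot forget it; HttpOnly is on by default but can be switched off for cookies that script genuinely needs to read.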
Now, let's look at the specific web servers and see what we can do to secure them.
The Apache HTTP Server is the most popular web server on the Internet, which helps explain why it comes as the default web server on so many systems. The Apache HTTP Server Project is an effort to develop and maintain an open source HTTP server for modern operating systems including Unix and Windows. The goal of this project is to provide a secure, efficient, and extensible server that provides HTTP services in sync with the current HTTP standards.
The following is a set of hardening guidelines for securing Apache:
  1. The Apache process should run as its own user and not root.
  2. Establish a group for web administration and allow that group to read/write configuration files and read the Apache log files:
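    # Create the administration group and give it the Apache config and logs
    groupadd webadmin
    chgrp -R webadmin /etc/apache2
    chgrp -R webadmin /var/log/apache2
    # The group may edit configuration files but only read log files
    chmod -R g+rw /etc/apache2
    chmod -R g+r /var/log/apache2
    # Append (-a) the group to each administrator's supplementary groups
    usermod -a -G webadmin user1
    usermod -a -G webadmin user2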
    
Application Server Hardening
Like web servers, application servers are flexible in their configuration. This flexibility allows them to be integrated into diverse environments. However, in many cases the out-of-the-box installation will not be hardened for Internet usage, so steps need to be taken to configure these servers securely.
The following are hardening recommendations for all next-generation web application servers, but particularly for Java and .NET servers.

Section 4.4.1.1: Hardening guidelines

  1. Run all applications over SSL.
  2. Do not rely on client-side validation. Make input validation decisions on the server.
  3. Use the HttpOnly cookie option to help protect against cross-site scripting.
  4. Plan how authentication and access controls work before implementation.
  5. Employ role-based authorization checks for resources such as pages and directories.
  6. Divide the file structure of the site into public and restricted areas and provide proper authentication and access controls to restricted areas.
  7. Validate all input for type, length, and format. Employ positive validation and check for known acceptable data before filtering for bad data.
  8. Handle exceptions securely by not providing debug or infrastructure details as part of the exception.
  9. Use absolute URLs when sites contain secure and unsecure items.
  10. Ensure parameters used in SQL statements or data access codes are validated for length and type of data to help prevent SQL injection.
  11. Mark cookies as "secure." Restrict authentication cookies by requiring the use of the secure cookie property.
  12. Ensure authentication cookies are not persisted or logged.
  13. Make sure cookies have unique path/name combinations.
  14. Keep personalization cookies separate from authentication cookies.
  15. Require error-directives or error pages for all web applications.
  16. Implement strong password policies for authentication.
  17. Define a low session timeout (15 minutes).
  18. Avoid generic server resource mappings such as wildcards (/*.do).
  19. Protect resources by storing them under the
Chapter 5: A Weak Foundation
When the Web was created everyone trusted each other, mostly because everyone knew each other. The network was much smaller back then, and everyone used the network the same way. It was not the free-for-all it is today. That said, the underlying infrastructure of the network hasn't changed all that much, but what is being exchanged over the network has changed. Today, people are managing their money, conducting business transactions, and hosting sensitive data over the Net.
The Internet still works fine as long as we trust each other. You know, that same kind of trust that lets us walk down the street, go to the store, or sing karaoke at the local bar without fear. In fact, without trust, you would never buy anything from Amazon or eBay again—let alone eat a hot dog.
Now, I don't know about you, but I don't trust everyone. I also want to keep my private data private and not let it leak out of my applications like motor oil from an old Buick. So, we must inspect the entire surface of the application and make sure the data stays in and the bad guys stay out. I start by asking myself how could data escape the system? Where can data be found or accessed? What security measures are currently in place to protect the data?
Some examples of where data leaks might occur are:
  • Runtime errors printed to the standard error or output stream. Depending on configuration, this information could be displayed to a system console or to an unprotected log file exposing details about the system and its operation.
  • Sensitive data is displayed to the user via web browser in a hidden field, HTTP cookie, or an HTML comment. Data hidden on the page can be revealed simply by viewing the source.
  • Debug code that outputs system data to the console or to an unprotected log file.
In this chapter, I am going to explore the major protocols associated with web applications, where the seams are, and what the possible attack vectors might be, and offer some recommended countermeasures to help make applications more secure.
As security-minded developers, it is important to take care in handling security-related information. The following examples and code should be tried only on a development system in a closed environment, not on a public Internet server.
HTTP Vulnerabilities
Hypertext—the operative word being text—is just text. Anyone can read it! It doesn't say secure text, private text, or keep your mitts off my data text. No, it says hypertext, which by itself is a little troubling.
HTTP was not designed with security in mind. It is a protocol for exchanging text and other types of files via links. The following sections are examples where the use of HTTP can lead to vulnerabilities.
A common mistake application developers make is assigning input values originating from an HTTP request and directly using them without inspecting them first.
In Java:
user = request.getParameter("user");
In .NET:
user = Request.Params["user"];
In PHP:
$user = $_POST['user'];
In each case the problem is the same. A variable posted via HTTP is plopped into an object (abstract representation of the request), and the programmer uses that object's value without validating or cleansing the data. It's easy to do, the code works, so why not?
Now, there are three legs to this stool. First, you need to know that the data is good data: did it come from a trusted source? Second, integrity checks must be included wherever data crosses from a trusted boundary to a less trusted one, such as from the application to the user's browser in a hidden field, or to a third-party payment gateway (for example, a transaction ID that is used internally when the user returns). Finally, security controls need to be in place that help preserve data integrity, ranging from hashes and checksums of the data to encryption. The point is that you must take steps to ensure the data you are getting is good data.
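As a rough illustration of the second and third legs, here is a minimal sketch that attaches a keyed hash (an HMAC) to a value before it leaves the trusted boundary and checks it on the way back. It assumes a Node.js-style runtime with the built-in crypto module; the names protect, verify, and SERVER_KEY are hypothetical and not taken from any framework.

```javascript
var crypto = require('crypto');

// Assumption: a secret known only to the server. A real key would be loaded
// from protected configuration, never hardcoded like this.
var SERVER_KEY = 'replace-with-a-real-secret';

// Computes a hex-encoded HMAC-SHA256 of the value.
function sign(value) {
  return crypto.createHmac('sha256', SERVER_KEY).update(value).digest('hex');
}

// Returns "value.signature", suitable for a hidden field or transaction ID.
function protect(value) {
  return value + '.' + sign(value);
}

// Returns the original value if the signature matches, otherwise null.
function verify(protectedValue) {
  var idx = protectedValue.lastIndexOf('.');
  if (idx < 0) { return null; }
  var value = protectedValue.substring(0, idx);
  var given = Buffer.from(protectedValue.substring(idx + 1), 'utf8');
  var expected = Buffer.from(sign(value), 'utf8');
  if (given.length !== expected.length) { return null; }
  // Constant-time comparison avoids leaking how many characters matched.
  return crypto.timingSafeEqual(given, expected) ? value : null;
}

var field = protect('txn-1001');
console.log(verify(field));                            // the untouched value
console.log(verify(field.replace('1001', '9999')));    // tampered, so null
```

The value itself stays readable; the check only tells you whether it was altered. Data that must also stay private needs encryption on top of this.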
Validation and integrity checks need to be in place to protect your application from tainted data. Validation must be performed at every entry point to your application. Each entry point should validate for the functions it can perform. For example, if data enters into your application from the Web, the data in the web tier should be tested for web defects. As the data moves into the business logic portion of the application, different validations need to occur. The point is that the data should always be validated before it is used.
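To make this concrete, here is a minimal sketch of positive (allow-list) validation for the user parameter from the earlier snippets, written in JavaScript. The rule itself (3 to 20 letters, digits, or underscores) is an assumption made up for the example; a real application would choose a rule that matches its own account policy.

```javascript
// Hypothetical rule: user names are 3 to 20 letters, digits, or underscores.
var USER_PATTERN = /^[A-Za-z0-9_]{3,20}$/;

// Returns the input unchanged when it matches the rule, otherwise null.
function validateUser(input) {
  if (typeof input !== 'string') { return null; }
  return USER_PATTERN.test(input) ? input : null;
}

console.log(validateUser('alice_01'));
console.log(validateUser("<script>alert('gotcha');</script>"));
```

The check accepts known good data instead of hunting for known bad data, so anything unexpected (a missing parameter, markup, an oversized string) is rejected by default.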
The Threats
The following are some common threats against web applications, ranging from the most common and dangerous forms of cross-site scripting to more legacy vulnerabilities such as buffer overflows and other data handling issues.
Cross-site scripting (XSS) is a common form of web attack where malicious script or other code that is included in an HTTP response is involuntarily executed by the user's browser. These types of attacks can take almost any form and can be extremely dangerous. Often the attacks include sending private data such as cookies to an attacker. This can be done by redirecting the victim's browser to a web site controlled by the attacker.
Usually, identity theft is what the attackers are looking for here. Attackers steal session identifiers or a user's login credentials and impersonate that victim on legitimate sites. Web applications can be used as a mechanism to transport attacks to an end user's browser. Successful attacks can disclose session tokens, spoof content, or otherwise trick the victim into believing they are on a legitimate web site. After an attacker has navigational control of the victim's session, the game is over.
XSS comes in two basic flavors:
Reflected XSS
Data is reflected immediately back to the browser from data injected on the URL or request—the idea being an attacker formulates a link that includes the malicious script, and the victim clicks that link:
<!-- Reflected XSS example -->
<%= request.getParameter("myVar") %>
This JSP code can be exploited by assigning script to the value for myVar. Here's an example of how a script might get injected using a link on a web page:
<a href=http://www.somesite.com/reflectedExample/index.html?myVar=<script>
alert('gotcha');</script>>Click here for your free iPhone!</a>
Persisted XSS
An attacker (somehow) manages to get her script stored on the server, for example as a database value, and the victim views a page that dynamically renders that value and executes the script.
This code is vulnerable to a persisted XSS attack.
<!-- Persisted XSS example -->
<% myVar = [VALUE FROM DATABASE]; %>
<%= myVar %>  <!-- value is output directly without encoding -->
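Both flavors share the same flaw: a value is written into the page without being encoded first. The sketch below shows the general idea of output encoding in JavaScript. The function name escapeHtml is hypothetical, and the sketch assumes the value is being placed in HTML element content or a quoted attribute; other contexts, such as inside a script block or a URL, need different encoding. A JSP page would apply the equivalent encoding on the server before writing the value out.

```javascript
// Replaces the characters that let text break out of HTML content or a
// quoted attribute. The ampersand must be handled first so that the
// entities produced by the later replacements are not encoded again.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeHtml("<script>alert('gotcha');</script>"));
```

After encoding, the browser displays the script as text instead of executing it.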
JSON
JavaScript Object Notation (JSON) is a lightweight data format based on the object notation of the JavaScript language. Unlike XML, JSON is already JavaScript so it does not have to endure heavy processing. Because of its ease of use and flexibility to exchange data, it has gained popularity. If you are thinking of using JSON, I would recommend you check out the JSON web site.
Example 5-7 shows a simple JSON structure.
Example 5-7. An example of JSON notation
{
    "type": "Menu",
    "value": "File",
    "items": [
        {"value": "New", "action": "CreateNewDocument"},
        {"value": "Open", "action": "OpenDocument"},
        {"value": "Save", "action": "SaveDocument"}
    ]
}
JSON was designed to be highly portable. It's what makes it useful. JSON output text can be directly interpreted by JavaScript, using eval( ):
var myVar = eval( '(' + jsontext + ')' );

Section 5.3.1.1: Validation and implementation

Passing JSON text straight into the eval( ) function is a bit like setting a bull loose in a china shop, since eval( ) will blindly interpret everything in the JSON text with no security or validation checking, but boy is it fast. So, what's wrong with automatically hydrating this stuff? The most obvious attack is XSS. Consider what would happen if the code in Example 5-8 were run through eval( ).
Example 5-8 shows XSS in JSON.
Example 5-8. Unvalidated JSON
{
    "name": "menu",
    "value": "File",
    "items": [
        {"value": "New", "action": "CreateNewDocument"},
        {"value": "Open", "action": "OpenDocument"},
        {"value": "Save", "action": "SaveDocument"}
    ]
});alert('Gotcha!!'
Various JSON validators are available on the Internet, including even one from the JSON web site. I strongly suggest using one if you are going to work with JSON.
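As one hedged illustration of validating before hydrating, the sketch below uses JSON.parse, which is built into current JavaScript engines, in place of eval( ). This assumes a runtime that provides JSON.parse natively; older browsers need a parser library instead. The wrapper name parseJsonSafely is hypothetical.

```javascript
// Parses strict JSON only. Anything that is not valid JSON, including
// trailing script, makes JSON.parse throw, and nothing is ever executed.
function parseJsonSafely(jsontext) {
  try {
    return { ok: true, value: JSON.parse(jsontext) };
  } catch (e) {
    return { ok: false, value: null };
  }
}

var good = '{"value": "File", "items": [{"value": "New"}]}';
var bad = '{"value": "File"});alert(\'Gotcha!!\'';

console.log(parseJsonSafely(good).ok);   // well-formed JSON is accepted
console.log(parseJsonSafely(bad).ok);    // the injected payload is rejected
```

The second string has the same shape as the unvalidated example: a JSON object followed by a statement that eval( ) would have run.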
Another problem in implementing JSON is failing to declare a proper MIME type. If JSON text is sent directly to the browser with a MIME type of text/html, the browser will render the JSON as if it were HTML, even if it's really just a JavaScript fragment. The easiest way to protect against this is to ensure that all JavaScript received by the
XML
The Extensible Markup Language (XML) is a markup language for describing information in documents in a structured way. XML is human readable, which makes it desirable from a development and integration point of view. What makes XML structured is that documents contain both content and metadata that describes that content.
Almost all documents have some structure, so XML is a great way of standardizing that structure into one common format. In web applications, XML is the preferred data exchange format and serves as the foundation of many web protocols and data interchange formats.
XML does not, by itself, have any security features. The following are examples where the use of XML can lead to vulnerabilities.
All information from web requests (or requests made from outside your network) that is not validated before being used in a web application should be considered tainted. This includes XML. Attackers can exploit vulnerabilities and use these flaws to attack backend components through a web application.
If XML data is accepted as input to a web application it is possible for an attacker to alter the values embedded in the XML to attack the system.
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
]>
<greeting><script>alert('Gotcha!');</script></greeting>
As with all input data, XML data should also be validated before it is used. This is particularly true when the XML is being used in the browser, as in the case of Ajax.
Often XML is used to transfer data between different systems using web services. In these cases, the connections between the systems need to know that they are reliably connecting to trusted systems and not hackers. Authenticating web services requests and limiting service requests with authorization checks can help preserve the confidentiality and integrity of the XML data being exchanged.
Restrictions on what authenticated users are allowed to do are not properly enforced. Attackers can exploit these flaws to access other users' accounts, view sensitive files, or use unauthorized functions.
XML is a popular format for exchanging data. If unvalidated input is used in the construction of XML (for example, XML, XPath queries, XSLT, and so on), the code could be vulnerable to XML injection. If an attacker can inject data to alter XML transactions, XPath queries, or XSLT transformations, she could expose or destroy data, gain privileges, or cause a denial of service.
RSS
Really Simple Syndication (RSS) is a syndication format used to publish frequently updated pages, such as blogs or news feeds. You would think that with all those S's one would mean security. Nope, I guess they missed that one. RSS formats are specified in XML, and RSS delivers its information via an XML file called an RSS feed, web feed, RSS stream, or RSS channel.
These web feeds allow software programs to check for updates published on a web site. To host a web feed, a web site uses specialized software (such as a content management system) to publish a list (or feed) of content. RSS helps ensure the content is in a standardized, machine-readable format. The feed can then be downloaded by aggregators or distributors that syndicate content from the feed, or by feed reader programs that allow Internet users to subscribe to feeds and view their content. On web pages, web feeds are identified with words such as subscribe, or with an orange image with the letters RSS, or XML.
RSS helped create the concept of podcasting by supporting enclosures—attachments bundled into the XML and raw data. RSS is still the preferred syndication format for many podcasting applications such as Apple's iTunes. RSS has attracted large groups of supporters who remain satisfied by the specification and its capabilities.

Section 5.5.1.1: Consuming RSS

RSS is difficult to consume safely. The difficulty starts with the RSS specification, which allows for description elements to contain arbitrary entity-encoded HTML. Although this is great for feeds that publish RSS, it makes writing a secure RSS consumer application exceedingly difficult.
Because HTML can carry such dangerous content (such as scripts, ActiveX, remote images and CSS, or CSS that can take over the entire screen or contain JavaScript), it must be inspected before it is consumed.
Sadly, it is up to the RSS consumer to protect the content and not the feed's provider. In short, output encoding needs to be applied to all RSS data. Any harmful CSS, HTML, or JavaScript tags should also be removed.
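A minimal sketch of the consumer-side idea follows, in JavaScript. It treats every field of a feed item as untrusted text and encodes it on output. It also refuses links that do not start with http or https; that check is my own assumption, not something the RSS specification requires. The item shape ({title, link, description}) and the function names are hypothetical.

```javascript
var ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Output-encodes text so markup inside a feed is displayed, not interpreted.
function encode(text) {
  return String(text).replace(/[&<>"']/g, function (ch) { return ENTITIES[ch]; });
}

// Assumption: only http and https links are acceptable in a feed.
function safeLink(url) {
  return /^https?:\/\//i.test(String(url)) ? String(url) : '#';
}

// Renders one already-parsed feed item as an HTML fragment.
function renderItem(item) {
  return '<a href="' + encode(safeLink(item.link)) + '">' + encode(item.title) + '</a>' +
         '<p>' + encode(item.description) + '</p>';
}

console.log(renderItem({
  title: 'News',
  link: 'javascript:alert(1)',
  description: '<script>x</script>'
}));
```

This is deliberately strict: any HTML in the description is shown as plain text. A consumer that wants to keep some formatting needs a real HTML sanitizer with an allow-list of tags, which is a much larger job than this sketch.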
The following are things to consider removing when parsing RSS feeds:
Atom
Atom is another XML syndication format that is used for creating web feeds. Atom Publishing Protocol (APP) is a simple HTTP-based protocol for creating and updating web resources.
Like RSS, Atom feeds are used for the syndication of web content such as in Weblogs and headlines. Feeds usually contain a title and entries, which can be headlines, full-text articles, links, summaries, or other content.

Section 5.6.1.1: Atom compared to RSS

RSS, having arrived first to the syndication scene, was not perfect. Poor interoperability and incompatibility with earlier versions showed the need for a new standard. A faction of developers split off and formed Atom as a new syndication standard.
Here are some ways that Atom attempts to distinguish itself from RSS:
  1. Atom can distinguish between different content types such as HTML and plain text.
  2. Atom defines itself within an XML namespace.
  3. Atom requires each entry to be unique by using a unique identifier.
  4. Atom has separate elements for summary and content. Rather than simply providing a description, Atom attempts to distinguish between summary and content by providing the ability to include nontextual content in a summary.
  5. Atom includes a standard for auto-discovery—a process by which news readers and browsers can automatically know whether a page supplies a feed.
  6. Atom requires xml:base for relative URIs—providing the ability to distinguish between relative and nonrelative URIs.
  7. Atom also uses the xml:lang attribute rather than introduce its own proprietary language element.
  8. Rather than require full feed documents, Atom can also supply smaller Atom entry documents.
  9. Atom standardizes on dates by conforming to the format described in RFC 3339, a subset of ISO 8601 (the International Organization for Standardization [ISO] standard for date and time notation).
  10. Atom uses a real, IANA-registered, MIME type: application/atom+xml.
  11. Atom further conforms to XML standards by including an XML schema.
  12. Atom provides some description about how feeds and entries can be digitally signed using XML digital signatures.
A way of gaining trust with users is to prove that you are legit.
REST
In an attempt to tame the free-for-all that is the Web, Roy Fielding (a guy who has been working with the Apache Web Server Project forever) wrote his doctoral dissertation about how web resources should be named and used on the Internet to help better facilitate the exchange of data and the use of web services.
In Fielding's own words:
Representational State Transfer (REST) is intended to evoke an image of how a well-designed Web application behaves: a network of web pages (a virtual state-machine), where the user progresses through an application by selecting links (state transitions), resulting in the next page (representing the next state of the application) being transferred to the user and rendered for their use.
REST is concerned with the architecture of the Web. It does not address implementation details (such as using Java servlets, .NET, or CGI to implement a web service). REST is all about how resources are presented and used. It is not about specific implementation. It is an architectural style of building an application in a standard way.
Also, as a matter of style and from a security (information leakage) point of view, URLs should not reveal the implementation technique being used. You need to be free to change your implementation without impacting clients or having misleading URLs.

Section 5.7.1.1: REST web services characteristics

Here are the characteristics of REST:
Client-server
A pull-based interaction style. Components pull representations from the server.
Stateless
Each request to the server must contain all the information necessary to understand the request without taking advantage of any stored context on the server.
Cache
HTTP responses must be capable of being labeled cacheable or noncacheable for use with proxies and other web caching mechanisms.
Uniform interface
HTTP resources are accessed with the existing HTTP verbs (for example, HTTP GET, POST, PUT, DELETE).
Named resources
Systems are composed of resources, which are named using only a URL.
Interconnected resource representations
The representations of the resources are linked using URLs. Clients are allowed to progress from one state to another.
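To give a feel for the uniform interface and named resources, here is a small sketch in JavaScript. It is not a web server: handle is a hypothetical dispatcher over an in-memory store, and the status codes chosen are one reasonable reading of HTTP, not something REST itself dictates.

```javascript
// In-memory store keyed by URL path. Object.create(null) avoids inherited
// properties being mistaken for stored resources.
var resources = Object.create(null);

// Every resource is addressed by its URL and manipulated with the same verbs.
function handle(method, path, body) {
  switch (method) {
    case 'GET':
      return (path in resources)
        ? { status: 200, body: resources[path] }
        : { status: 404 };
    case 'PUT':
      resources[path] = body;          // create or replace the named resource
      return { status: 200, body: body };
    case 'DELETE':
      delete resources[path];
      return { status: 204 };
    default:
      return { status: 405 };          // verb not supported by this sketch
  }
}

handle('PUT', '/menus/file', { value: 'File' });
console.log(handle('GET', '/menus/file').body.value);
console.log(handle('GET', '/menus/edit').status);
```

Note that each call carries everything needed to process it (verb, resource name, representation), which is the stateless property described above. Nothing in the path reveals how the service is implemented.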
Chapter 6: Securing Web Services
Web services are a collection of Internet technologies that expose application functions on the Web and allow machines in different locations to talk to one another. Applications use web services to share and process information—making federated applications. The basic idea is to promote component driven applications and component reuse. You chose what