Java Servlet Programming
Java Servlet Programming, Second Edition

By Jason Hunter
With William Crawford
Price: $44.95 USD
£31.95 GBP



Table of Contents

Chapter 1: Introduction
The rise of server-side Java applications (everything from standalone servlets to the full Java 2, Enterprise Edition (J2EE), platform) has been one of the most exciting trends to watch in Java programming. The Java language was originally intended for use in small, embedded devices. It was first hyped as a language for developing elaborate client-side web content in the form of applets. But until the last few years, Java's potential as a server-side development platform had been sadly overlooked. Now, Java has come to be recognized as a language ideally suited for server-side development.
Businesses in particular have been quick to recognize Java's potential on the server: Java is inherently suited for large client/server applications. The cross-platform nature of Java is extremely useful for organizations that have a heterogeneous collection of servers running various flavors of the Unix and Windows (and increasingly Mac OS X) operating systems. Java's modern, object-oriented, memory-protected design allows developers to cut development cycles and increase reliability. In addition, Java's built-in support for networking and enterprise APIs provides access to legacy data, easing the transition from older client/server systems.
Java servlets are a key component of server-side Java development. A servlet is a small, pluggable extension to a server that enhances the server's functionality. Servlets allow developers to extend and customize any Java-enabled web or application server with a hitherto unknown degree of portability, flexibility, and ease. But before we go into any more detail, let's put things into perspective.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
History of Web Applications
While servlets can be used to extend the functionality of any Java-enabled server, they are most often used to extend web servers, providing a powerful, efficient replacement for CGI scripts. When you use a servlet to create dynamic content for a web page or otherwise extend the functionality of a web server, you are in effect creating a web application. While a web page merely displays static content and lets the user navigate through that content, a web application provides a more interactive experience. A web application can be as simple as a keyword search on a document archive or as complex as an electronic storefront. Web applications are being deployed on the Internet and on corporate intranets and extranets, where they have the potential to increase productivity and change the way that companies, large and small, do business.
To understand the power of servlets, we need to step back and look at some of the other approaches that can be used to create web applications.
The Common Gateway Interface, normally referred to as CGI, was one of the first practical techniques for creating dynamic content. With CGI, a web server passes certain requests to an external program. The output of this program is then sent to the client in place of a static file. The advent of CGI made it possible to implement all sorts of new functionality in web pages, and CGI quickly became a de facto standard, implemented on dozens of web servers.
It's interesting to note that the ability of CGI programs to create dynamic web pages is a side effect of its intended purpose: to define a standard method for an information server to talk with external applications. This origin explains why CGI has perhaps the worst life cycle imaginable. When a server receives a request that accesses a CGI program, it must create a new process to run the CGI program and then pass to it, via environment variables and standard input, every bit of information that might be necessary to generate a response. Creating a process for every such request requires time and significant server resources, which limits the number of requests a server can handle concurrently. Figure 1-1 shows the CGI life cycle.
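To make the cost of this life cycle concrete, here is a minimal sketch of what a CGI-style program looks like when written in Java. The class name HelloCgi and the respond( ) helper are our own inventions for illustration; REQUEST_METHOD and QUERY_STRING are two of the standard CGI environment variables. The point is that the server would have to start a whole new process (here, a whole new JVM) to run main( ) for every request.

```java
import java.util.Map;

// Hypothetical sketch of a CGI-style program: one fresh process per request.
class HelloCgi {

  // Builds the complete CGI output (header, blank line, body) from the
  // environment variables the web server sets for this request.
  static String respond(Map<String, String> env) {
    String method = env.getOrDefault("REQUEST_METHOD", "GET");
    String query = env.getOrDefault("QUERY_STRING", "");

    StringBuilder out = new StringBuilder();
    out.append("Content-Type: text/plain\n");
    out.append("\n");  // a blank line separates the headers from the body
    out.append("method=").append(method).append("\n");
    out.append("query=").append(query).append("\n");
    return out.toString();
  }

  public static void main(String[] args) {
    // The server passes request details in through the process environment
    // and reads the response from standard output.
    System.out.print(respond(System.getenv()));
  }
}
```

Nothing survives from one request to the next in this model: any state the program needs must be rebuilt each time the process starts.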
Support for Servlets
Like Java itself, servlets were designed for portability. Servlets are supported on all platforms that support Java, and servlets work with all the major web servers. Java servlets, as defined by the Java Software division of Sun Microsystems (formerly known as JavaSoft), are an Optional Package to Java (formerly known as a Standard Extension). This means that servlets are officially blessed by Sun and are part of the Java language, but they are not part of the core Java API. Instead, they are now recognized as part of the J2EE platform.
To make it easy for you to develop servlets, Sun and Apache have made available the API classes separately from any web engine. The javax.servlet and javax.servlet.http packages constitute this Servlet API. The latest version of these classes is available for download from http://java.sun.com/products/servlet/download.html. All web servers that support servlets must use these classes internally (although they could use an alternate implementation), so generally this JAR file can also be found somewhere within the distribution of your servlet-enabled web server.
It doesn't much matter where you get the servlet classes, as long as you have them on your system, since you need them to compile your servlets. In addition to the servlet classes, you need a servlet runner (technically called a servlet container, sometimes called a servlet engine), so that you can test and deploy your servlets. Your choice of servlet container depends in part on the web server(s) you are running. There are three flavors of servlet containers: standalone, add-on, and embeddable.
A standalone servlet container is a server that includes built-in support for servlets. Such a container has the advantage that everything works right out of the box. One disadvantage, however, is that you have to wait for a new release of the web server to get the latest servlet support. Another disadvantage is that server vendors generally support only the vendor-provided JVM. Web servers that provide standalone support include those in the following list.
The Power of Servlets
So far, we have portrayed servlets as an alternative to other dynamic web content technologies, but we haven't really explained why we think you should use them. What makes servlets a viable choice for web development? We believe that servlets offer a number of advantages over other approaches, including portability, power, efficiency, endurance, safety, elegance, integration, extensibility, and flexibility. Let's examine each in turn.
Because servlets are written in Java and conform to a well-defined and widely accepted API, they are highly portable across operating systems and across server implementations. You can develop a servlet on a Windows NT machine running the Tomcat server and later deploy it effortlessly on a high-end Unix server running the iPlanet/Netscape Application Server. With servlets, you can truly "write once, serve everywhere."
Servlet portability is not the stumbling block it so often is with applets, for two reasons. First, servlet portability is not mandatory. Unlike applets, which have to be tested on all possible client platforms, servlets have to work only on the server machines that you are using for development and deployment. Unless you are in the business of selling your servlets, you don't have to worry about complete portability. Second, servlets avoid the most error-prone and inconsistently implemented portion of the Java language: the Abstract Windowing Toolkit (AWT) that forms the basis of Java graphical user interfaces, including Swing.
Servlets can harness the full power of the core Java APIs: networking and URL access, multithreading, image manipulation, data compression, database connectivity (JDBC), object serialization, internationalization, remote method invocation (RMI), and legacy integration (CORBA). Servlets can also take advantage of the J2EE platform that includes support for Enterprise JavaBeans (EJBs), distributed transactions (JTS), standardized messaging (JMS), directory lookup (JNDI), and advanced database access (JDBC 2.0). The list of standard APIs available to servlets continues to grow, making the task of web application development faster, easier, and more reliable.
Chapter 2: HTTP Servlet Basics
This chapter provides a short tutorial on how to write and execute a simple HTTP servlet. Then it explains how to deploy the servlet in a standard web application and how to configure the servlet's behavior using an XML-based deployment descriptor.
Unlike the first edition, this chapter does not cover servlet-based server-side includes (SSI) or servlet chaining and filtering. This is because those techniques, as useful as they were and despite the fact they were implemented in the Java Web Server, have not been officially endorsed by the servlet specification (which came out after the first edition of this book was published). SSI has been replaced by new techniques for doing programmatic includes. Servlet chaining has been decreed too inelegant for official endorsement, although the basic idea seems likely to reappear in Servlet API 2.3 as part of an official general-purpose pre- and post-filtering mechanism.
Note that the code for each of the examples in this chapter and throughout the book is available for download in both source and compiled form (as described in the preface). However, for this first chapter, we suggest that you deny yourself the convenience of the Internet and take the time to type in the examples. It should help the concepts seep into your brain. Don't be alarmed if we seem to skim lightly over some topics in this chapter. Servlets are powerful and, at times, complicated. The point here is to give you a general overview of how things work, before jumping in and overwhelming you with all of the details. By the end of this book, we promise that you'll be able to write servlets that do everything but make tea.
HTTP Basics
Before we can even show you a simple HTTP servlet, we need to make sure that you have a basic understanding of how the protocol behind the Web, HTTP, works. If you're an experienced CGI programmer (or if you've done any serious server-side web programming), you can safely skip this section. Better yet, you might skim it to refresh your memory about the finer points of the GET and POST methods. If you are new to the world of server-side web programming, however, you should read this material carefully, as the rest of the book is going to assume that you understand HTTP. For a more thorough discussion of HTTP and its methods, see HTTP Pocket Reference by Clinton Wong (O'Reilly).
HTTP is a simple, stateless protocol. A client, such as a web browser, makes a request, the web server responds, and the transaction is done. When the client sends a request, the first thing it specifies is an HTTP command, called a method, that tells the server the type of action it wants performed. This first line of the request also specifies the address of a document (a URL) and the version of the HTTP protocol it is using. For example:
GET /intro.html HTTP/1.0
This request uses the GET method to ask for the document named intro.html, using HTTP Version 1.0. After sending the request, the client can send optional header information to tell the server extra information about the request, such as what software the client is running and what content types it understands. This information doesn't directly pertain to what was requested, but it could be used by the server in generating its response. Here are some sample request headers:
User-Agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)
Accept: image/gif, image/jpeg, text/*, */*
The User-Agent header provides information about the client software, while the Accept header specifies the media (MIME) types that the client prefers to accept. (We'll talk more about request headers in the context of servlets in Chapter 4.) After the headers, the client sends a blank line, to indicate the end of the header section. The client can also send additional data, if appropriate for the method being used, as it is with the POST method that we'll discuss shortly. If the request doesn't send any data, it ends with an empty line.
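The request layout just described (a request line, then headers, then a blank line) can be picked apart with a few lines of plain Java. The following is only a sketch: the RequestSketch class is hypothetical, it assumes lines end with a carriage return and line feed, and it ignores any body data that might follow the blank line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that splits a raw HTTP request into its parts.
class RequestSketch {
  final String method;
  final String uri;
  final String version;
  final Map<String, String> headers = new LinkedHashMap<String, String>();

  RequestSketch(String raw) {
    String[] lines = raw.split("\r\n", -1);

    // First line: method, document address, protocol version
    String[] parts = lines[0].split(" ");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Bad request line: " + lines[0]);
    }
    method = parts[0];
    uri = parts[1];
    version = parts[2];

    // Header lines continue until the first empty line
    for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++) {
      int colon = lines[i].indexOf(':');
      if (colon < 0) continue;  // skip anything that is not "Name: value"
      String name = lines[i].substring(0, colon).trim();
      String value = lines[i].substring(colon + 1).trim();
      headers.put(name, value);
    }
  }
}
```

Fed the sample request above, this sketch would report a method of GET, a URI of /intro.html, and two headers. A servlet never has to do this parsing itself, because the server hands it the same information through the request object.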
The Servlet API
Now that you have a basic understanding of HTTP, we can move on and talk about the Servlet API that you'll be using to create HTTP servlets, or any kind of servlets, for that matter. Servlets use classes and interfaces from two packages: javax.servlet and javax.servlet.http. The javax.servlet package contains classes and interfaces to support generic, protocol-independent servlets. These classes are extended by the classes in the javax.servlet.http package to add HTTP-specific functionality. The top-level package name is javax instead of the familiar java, to indicate that the Servlet API is an Optional Package (formerly called a Standard Extension).
Every servlet must implement the javax.servlet.Servlet interface. Most servlets implement this interface by extending one of two special classes: javax.servlet.GenericServlet or javax.servlet.http.HttpServlet. A protocol-independent servlet should subclass GenericServlet, while an HTTP servlet should subclass HttpServlet, which is itself a subclass of GenericServlet with added HTTP-specific functionality.
Unlike a regular Java program, and just like an applet, a servlet does not have a main( ) method. Instead, certain methods of a servlet are invoked by the server in the process of handling requests. Each time the server dispatches a request to a servlet, it invokes the servlet's service( ) method.
A generic servlet should override its service( ) method to handle requests as appropriate for the servlet. The service( ) method accepts two parameters: a request object and a response object. The request object tells the servlet about the request, while the response object is used to return a response. Figure 2-1 shows how a generic servlet handles requests.
Figure 2-1: A generic servlet handling a request
In contrast, an HTTP servlet usually does not override the service( ) method. Instead, it overrides doGet( ) to handle GET requests and doPost( ) to handle POST requests. An HTTP servlet can override either or both of these methods, depending on the type of requests it needs to handle. The
Page Generation
The most basic type of HTTP servlet generates a full HTML page. Such a servlet has access to the same information usually sent to a CGI script, plus a bit more. A servlet that generates an HTML page can be used for all the tasks for which CGI is used currently, such as for processing HTML forms, producing reports from a database, taking orders, checking identities, and so forth.
Example 2-1 shows an HTTP servlet that generates a complete HTML page. To keep things as simple as possible, this servlet just says "Hello World" every time it is accessed via a web browser.
Example 2-1. A Servlet That Prints "Hello World"
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class HelloWorld extends HttpServlet {

  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {

    res.setContentType("text/html");
    PrintWriter out = res.getWriter();

    out.println("<HTML>");
    out.println("<HEAD><TITLE>Hello World</TITLE></HEAD>");
    out.println("<BODY>");
    out.println("<BIG>Hello World</BIG>");
    out.println("</BODY></HTML>");
  }
}
This servlet extends the HttpServlet class and overrides the doGet( ) method inherited from it. Each time the web server receives a GET request for this servlet, the server invokes this doGet( ) method, passing it an HttpServletRequest object and an HttpServletResponse object.
The HttpServletRequest represents the client's request. This object gives a servlet access to information about the client, the parameters for this request, the HTTP headers passed along with the request, and so forth. Chapter 4 explains the full capabilities of the request object. For this example, we can completely ignore it. After all, this servlet is going to say "Hello World" no matter what the request!
The HttpServletResponse represents the servlet's response. A servlet can use this object to return data to the client. This data can be of any content type, though the type should be specified as part of the response. A servlet can also use this object to set HTTP response headers. Chapter 5 and Chapter 6 explain everything a servlet can do as part of its response.
Web Applications
A web application (sometimes shortened to web app) is a collection of servlets, JavaServer Pages (JSPs), HTML documents, images, templates, and other web resources that are set up in such a way as to be portably deployed across any servlet-enabled web server. By having everyone agree on exactly where files in a web application are to be placed and agreeing on a standard configuration file format, a web app can be transferred from one server to another easily without requiring any extra server administration. Gone are the days of detailed instruction sheets telling you how to install third-party web components, with different instructions for each type of web server.
All the files under server_root/webapps/ROOT belong to a single web application (the root one). To simplify deployment, these files can be bundled into a single archive file and deployed to another server merely by placing the archive file into a specific directory. These archive files have the extension .war, which stands for web application archive. WAR files are actually JAR files (created using the jar utility) saved with an alternate extension. Using the JAR format allows WAR files to be stored in compressed form and have their contents digitally signed. The .war file extension was chosen over .jar to let people and tools know to treat them differently.
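Because a WAR file is just a JAR file with a different extension, the standard java.util.jar classes can read and write one. The sketch below builds a tiny archive in memory and lists its entries. The WarSketch class and its placeholder file contents are hypothetical; a real WAR would normally be created with the jar utility from actual files on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper showing that WAR contents are ordinary JAR entries.
class WarSketch {

  // Packs the given entry names, each with placeholder contents,
  // into an in-memory archive.
  static byte[] build(String... names) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      JarOutputStream jar = new JarOutputStream(bytes);
      for (String name : names) {
        jar.putNextEntry(new JarEntry(name));
        jar.write("placeholder".getBytes(StandardCharsets.UTF_8));
        jar.closeEntry();
      }
      jar.close();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns the entry names found in the archive, in stored order.
  static List<String> list(byte[] war) {
    try {
      List<String> names = new ArrayList<String>();
      JarInputStream in = new JarInputStream(new ByteArrayInputStream(war));
      JarEntry entry;
      while ((entry = in.getNextJarEntry()) != null) {
        names.add(entry.getName());
      }
      in.close();
      return names;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Writing the bytes returned by build( ) to a file ending in .war would give an archive that the jar utility can list like any other JAR.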
The file structure inside a web app is strictly defined. Example 2-4 shows a possible file listing.
Example 2-4. The File Structure Inside a Web Application
index.html
feedback.jsp
images/banner.gif
images/jumping.gif
WEB-INF/web.xml
WEB-INF/lib/bhawk4j.jar
WEB-INF/classes/MyServlet.class
WEB-INF/classes/com/mycorp/frontend/CorpServlet.class
WEB-INF/classes/com/mycorp/frontend/SupportClass.class
This hierarchy can be maintained as separate files under some server directory or they can be bundled together into a WAR file. On install, this web application can be mapped to any URI prefix path on the server. The web application then handles all requests beginning with that prefix. For example, if the preceding file structure were installed under the prefix
Moving On
We realize this chapter has been a whirlwind introduction to servlets, web applications, and XML configuration files. By now, we hope you have an idea of how to write a simple servlet, install it on your server, and tell the server the paths for which you want it to be executed. Of course, servlets can do far more than say "Hello World" and greet users by name. Now that you've got your feet wet, we can dive into the details and move on to more interesting applications.
Chapter 3: The Servlet Lifecycle
The servlet lifecycle is one of the most exciting features of servlets. This lifecycle is a powerful hybrid of the lifecycles used by CGI programming and lower-level WAI/NSAPI and ISAPI programming, as discussed in Chapter 1.
The Servlet Alternative
The servlet lifecycle allows servlet containers to address both the performance and resource problems of CGI and the security concerns of low-level server API programming. A common way to execute servlets is for the servlet container to run all its servlets in a single Java Virtual Machine (JVM). By placing all the servlets into the same JVM, the servlets can efficiently share data with one another, yet they are prevented by the Java language from accessing one another's private data. Servlets can persist between requests inside the JVM as object instances. This takes up far less memory than full-fledged processes, yet servlets still are able to efficiently maintain references to external resources.
The servlet lifecycle is highly flexible. The only hard and fast rule is that a servlet container must conform to the following lifecycle contract:
  1. Create and initialize the servlet.
  2. Handle zero or more service calls from clients.
  3. Destroy the servlet and then garbage collect it.
It's perfectly legal for a servlet to be loaded, created, and instantiated in its own JVM, only to be destroyed and garbage collected without handling any client requests or after handling just one request. Any servlet container that makes this a habit, however, probably won't last long on the open market. In this chapter we describe the most common and most sensible lifecycle implementations for HTTP servlets.
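The three-step contract can be modeled in a few lines of plain Java. Everything in this sketch is a simplified stand-in that we made up for illustration: LifecycleComponent is not the javax.servlet.Servlet interface, and MiniContainer is far simpler than any real servlet container. It shows only the ordering guarantee: one initialization, zero or more service calls, one destruction.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a servlet; NOT the javax.servlet.Servlet interface.
interface LifecycleComponent {
  void init();
  String service(String request);
  void destroy();
}

// Records each lifecycle call so the ordering can be inspected afterward.
class CountingComponent implements LifecycleComponent {
  final List<String> log = new ArrayList<String>();
  private int count;

  public void init() {
    count = 0;
    log.add("init");
  }

  public String service(String request) {
    count++;  // state persists between requests on the same instance
    log.add("service");
    return request + " handled, count=" + count;
  }

  public void destroy() {
    log.add("destroy");
  }
}

// Hypothetical container that drives one complete lifecycle.
class MiniContainer {
  static List<String> run(LifecycleComponent component, String... requests) {
    List<String> responses = new ArrayList<String>();
    component.init();                      // step 1: create and initialize
    try {
      for (String request : requests) {    // step 2: zero or more service calls
        responses.add(component.service(request));
      }
    } finally {
      component.destroy();                 // step 3: destroy
    }
    return responses;
  }
}
```

Calling MiniContainer.run( ) with no requests at all is still a legal lifecycle in this model, which mirrors the point made above about a servlet that is destroyed without ever handling a request.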
Most servlet containers want to execute all servlets in a single JVM to maximize the ability of servlets to share information. (The exception is high-end containers that support distributed servlet execution across multiple backend servers, as discussed in Chapter 12.) Where that single JVM executes can differ depending on the server:
  • With a server written in Java, such as the Apache Tomcat server, the server itself can execute inside a JVM right alongside its servlets.
  • With a single-process, multithreaded web server written in another language, the JVM can often be embedded inside the server process. Having the JVM be part of the server process maximizes performance because a servlet becomes, in a sense, just another low-level server API extension. Such a server can invoke a servlet with a lightweight context switch and can provide information about requests through direct method invocations.
Servlet Reloading
If you tried using these counters for yourself, you may have noticed that any time you recompiled one, its count automatically began again at 1. Trust us: it's not a bug, it's a feature. Most servers automatically reload a servlet after its class file (under the default servlet directory, such as WEB-INF/classes) changes. It's an on-the-fly upgrade procedure that greatly speeds up the development-test cycle and allows for long server uptimes.
Servlet reloading may appear to be a simple feature, but it's quite a trick and requires quite a hack. ClassLoader objects are designed to load a class just once. To get around this limitation and load servlets again and again, servers use custom class loaders that load servlets from special directories such as WEB-INF/classes.
When a server dispatches a request to a servlet, it first checks whether the servlet's class file has changed on disk. If it has changed, the server abandons the class loader used to load the old version and creates a new instance of the custom class loader to load the new version. Some servers improve performance by checking modification timestamps only after some timeout since the previous check or upon explicit administrator request.
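The check-then-replace step might look something like the following sketch. The ReloadingLoaderSketch class is hypothetical and uses the standard URLClassLoader in place of a server's own custom class loader. The caller is assumed to supply the class file's current modification time, for example from File.lastModified( ).

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch of timestamp-driven class loader replacement.
class ReloadingLoaderSketch {
  private final URL[] classpath;
  private long loadedTimestamp;
  private ClassLoader loader;

  ReloadingLoaderSketch(URL[] classpath, long initialTimestamp) {
    this.classpath = classpath;
    this.loadedTimestamp = initialTimestamp;
    this.loader = new URLClassLoader(classpath);
  }

  // Call before dispatching a request, passing the class file's current
  // modification time. Returns the loader to use for this request.
  synchronized ClassLoader loaderFor(long currentTimestamp) {
    if (currentTimestamp != loadedTimestamp) {
      // The file changed: abandon the old loader and create a fresh one,
      // since a loader will not load the same class a second time.
      loader = new URLClassLoader(classpath);
      loadedTimestamp = currentTimestamp;
    }
    return loader;
  }
}
```

As long as the timestamp is unchanged, every request gets the same loader and therefore the same loaded classes. Once the timestamp differs, classes loaded through the new loader are distinct from those loaded through the old one, even if their names match.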
In Servlet API versions before 2.2, this class loader trick resulted in different servlets being loaded by different class loaders, a situation that would sometimes cause a ClassCastException to be thrown when the servlets shared information (because a class loaded by one class loader is not the same as the class loaded by a second class loader, even if the underlying class data is identical). Beginning in Servlet API 2.2, it's mandated that these ClassCastException problems must not occur for servlets inside the same context. So most server implementations now load each web application context within a single class loader and use a new class loader to reload the entire context when any servlet in the context changes. Since all servlets and support classes in the context always have the same class loader, there will be no unexpected
Init and Destroy
Just like applets, servlets can define init( ) and destroy( ) methods. The server calls a servlet's init( ) method after the server constructs the servlet instance and before the servlet handles any requests. The server calls the destroy( ) method after the servlet has been taken out of service and all pending requests to the servlet have completed or timed out.
Depending on the server and the web application configuration, the init( ) method may be called at any of these times:
  • When the server starts
  • When the servlet is first requested, just before the service( ) method is invoked
  • At the request of the server administrator
In any case, init( ) is guaranteed to be called and completed before the servlet handles its first request.
The init( ) method is typically used to perform servlet initialization: creating or loading objects that are used by the servlet in the handling of its requests. During the init( ) method a servlet may want to read its initialization (init) parameters. These parameters are given to the servlet itself and are not associated with any single request. They can specify initial values, like where a counter should begin counting, or default values, perhaps a template to use when not specified by the request. Init parameters for a servlet are set in the web.xml deployment descriptor, although some servers have graphical interfaces for modifying this file. See Example 3-3.
Example 3-3. Setting init Parameters in the Deployment Descriptor
<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE web-app
    PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
    "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">

<web-app>
    <servlet>
        <servlet-name>
            counter
        </servlet-name>
        <servlet-class>
            InitCounter
        </servlet-class>
        <init-param>
            <param-name>
                initial
            </param-name>
            <param-value>
                1000
            </param-value>
            <description>
                The initial value for the counter  <!-- optional -->
            </description>
        </init-param>
    </servlet>
</web-app>
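On the Java side, an init parameter arrives as a string and has to be converted before use. The sketch below shows only that conversion step for the "initial" parameter. It is not a real servlet: the InitCounterSketch class is hypothetical, and a plain Map stands in for the init parameters that the container would supply. The value is trimmed defensively because the descriptor above surrounds it with whitespace; the sketch does not rely on the container stripping it.

```java
import java.util.Map;

// Hypothetical counter that takes its starting value from an init parameter.
class InitCounterSketch {
  private int count;

  // initParams is a stand-in for the servlet's init parameters.
  void init(Map<String, String> initParams) {
    String initial = initParams.get("initial");
    if (initial == null) {
      count = 0;  // parameter not configured: fall back to a default
      return;
    }
    try {
      count = Integer.parseInt(initial.trim());
    } catch (NumberFormatException e) {
      count = 0;  // parameter present but not a number: same default
    }
  }

  int current() {
    return count;
  }

  int increment() {
    return ++count;
  }
}
```

Falling back to a default rather than failing is a design choice made for this sketch; a servlet could equally well refuse to initialize when its configuration is wrong.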
Single-Thread Model
Although the normal situation is to have one servlet instance per registered servlet name, it is possible for a servlet to elect instead to have a pool of instances created for each of its names, all sharing the duty of handling requests. Such servlets indicate this desire by implementing the javax.servlet.SingleThreadModel interface. This is an empty, "tag" interface that defines no methods or variables and serves only to flag the servlet as wanting the alternate lifecycle.
A server that loads a SingleThreadModel servlet must guarantee, according to the Servlet API documentation, "that no two threads will execute concurrently in the servlet's service method." To accomplish this, each thread uses a free servlet instance from the pool, as shown in Figure 3-3. Thus, any servlet implementing SingleThreadModel can be considered thread safe and isn't required to synchronize access to its instance variables. Some servers allow the number of instances per pool to be configured, others don't. Some servers use pools with just one instance, causing behavior identical to a synchronized service( ) method.
Figure 3-3: The single-thread model
A SingleThreadModel lifecycle is pointless for a counter or other servlet application that requires central state maintenance. The lifecycle can be of some use, however, in avoiding synchronization while still performing efficient request handling.
For example, a servlet that connects to a database sometimes needs to perform several database commands atomically as part of a single transaction. Each database transaction requires a dedicated database connection object, so the servlet somehow needs to ensure no two threads try to access the same connection at the same time. This could be done using synchronization, letting the servlet manage just one request at a time. By instead implementing SingleThreadModel and having one "connection" instance variable per servlet, a servlet can easily handle concurrent requests because each instance has its own connection. The skeleton code is shown in Example 3-6.
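The pooling idea can be sketched without the Servlet API. In the sketch below, PooledHandler stands in for a servlet instance that owns one connection, and HandlerPool stands in for the server's instance pool; both names, and the use of a string in place of a real connection object, are assumptions made for illustration. It is not how any particular server implements SingleThreadModel.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for one servlet instance that owns one connection.
class PooledHandler {
  private final String connection;  // stands in for a database connection
  private int handled = 0;          // unsynchronized: one thread at a time uses this instance

  PooledHandler(String connection) {
    this.connection = connection;
  }

  String handle(String request) {
    handled++;
    return connection + " handled " + request + " (#" + handled + ")";
  }
}

// Hypothetical pool that hands each request thread a free instance.
class HandlerPool {
  private final BlockingQueue<PooledHandler> free;

  HandlerPool(int size) {
    free = new ArrayBlockingQueue<PooledHandler>(size);
    for (int i = 0; i < size; i++) {
      free.add(new PooledHandler("conn-" + i));
    }
  }

  String service(String request) {
    PooledHandler handler;
    try {
      handler = free.take();        // wait until some instance is free
    }
    catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for an instance", e);
    }
    try {
      return handler.handle(request);
    }
    finally {
      free.add(handler);            // hand the instance back to the pool
    }
  }
}
```

With a pool of size one, requests are simply serialized, which matches the behavior described above for servers that keep a single instance per pool.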
Background Processing
Servlets can do more than simply persist between accesses. They can also execute between accesses. Any thread started by a servlet can continue executing even after the response has been sent. This ability proves most useful for long-running tasks whose incremental results should be made available to multiple clients. A background thread started in init( ) performs continuous work while request-handling threads display the current status with doGet( ). It's a similar technique to that used in animation applets, where one thread changes the picture and another paints the display.
Example 3-7 shows a servlet that searches for prime numbers above one quadrillion. It starts with such a large number to make the calculation slow enough to adequately demonstrate caching effects, something we need for the next section. The algorithm it uses couldn't be simpler: it selects odd-numbered candidates and attempts to divide them by every odd integer between 3 and their square root. If none of the integers evenly divides the candidate, it is declared prime.
Example 3-7. On the Hunt for Primes
import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class PrimeSearcher extends HttpServlet implements Runnable {

  long lastprime = 0;                    // last prime found
  Date lastprimeModified = new Date();   // when it was found
  Thread searcher;                       // background search thread

  public void init() throws ServletException {
    searcher = new Thread(this);
    searcher.setPriority(Thread.MIN_PRIORITY);  // be a good citizen
    searcher.start();
  }

  public void run() {
    //               QTTTBBBMMMTTTOOO
    long candidate = 1000000000000001L;  // one quadrillion and one

    // Begin loop searching for primes
    while (true) {                       // search forever
      if (isPrime(candidate)) {
        lastprime = candidate;           // new prime
        lastprimeModified = new Date();  // new "prime time"
      }
      candidate += 2;                    // evens aren't prime

      // Between candidates take a 0.2 second break.
      // Another way to be a good citizen with system resources.
      try {
        Thread.sleep(200);  // sleep() is static and pauses the current thread
      }
      catch (InterruptedException ignored) { }
    }
  }

  private static boolean isPrime(long candidate) {
    // Try dividing the number by all odd numbers between 3 and its sqrt
    long sqrt = (long) Math.sqrt(candidate);
    for (long i = 3; i <= sqrt; i += 2) {
      if (candidate % i == 0) return false;  // found a factor
    }

    // Wasn't evenly divisible, so it's prime
    return true;
  }

  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
    res.setContentType("text/plain");
    PrintWriter out = res.getWriter();
    if (lastprime == 0) {
      out.println("Still searching for first prime...");
    }
    else {
      out.println("The last prime discovered was " + lastprime);
      out.println(" at " + lastprimeModified);
    }
  }

  public void destroy() {
    searcher.stop();
  }
}
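Thread.stop( ) is deprecated in later JDK releases, so newer code usually asks the background thread to finish by interrupting it. The following is a minimal JDK-only sketch of that approach, assuming the same search loop as Example 3-7. The class name InterruptiblePrimeSearch and its start( ), stop( ), isRunning( ), and lastPrime( ) methods are hypothetical; in a servlet, start( ) and stop( ) would correspond to init( ) and destroy( ).

```java
// Hypothetical JDK-only variant of the searcher; no Servlet API involved.
class InterruptiblePrimeSearch implements Runnable {

  private volatile long lastPrime = 0;   // volatile: written by the searcher, read by other threads
  private final long firstCandidate;     // assumed to be odd and at least 3
  private final Thread searcher = new Thread(this);

  InterruptiblePrimeSearch(long firstCandidate) {
    this.firstCandidate = firstCandidate;
  }

  void start() {                         // would be called from init()
    searcher.setDaemon(true);            // do not keep the JVM alive for this thread
    searcher.setPriority(Thread.MIN_PRIORITY);
    searcher.start();
  }

  void stop() {                          // would be called from destroy()
    searcher.interrupt();                // ask the thread to finish
    try {
      searcher.join(5000);               // wait up to five seconds for it to exit
    }
    catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  boolean isRunning() { return searcher.isAlive(); }

  long lastPrime() { return lastPrime; }

  public void run() {
    long candidate = firstCandidate;
    while (!Thread.currentThread().isInterrupted()) {
      if (isPrime(candidate)) {
        lastPrime = candidate;
      }
      candidate += 2;                    // evens aren't prime
      try {
        Thread.sleep(200);               // same 0.2 second break as Example 3-7
      }
      catch (InterruptedException e) {
        return;                          // interrupted during the break: stop searching
      }
    }
  }

  static boolean isPrime(long candidate) {
    // Try dividing by all odd numbers between 3 and the square root
    long sqrt = (long) Math.sqrt(candidate);
    for (long i = 3; i <= sqrt; i += 2) {
      if (candidate % i == 0) return false;
    }
    return true;
  }
}
```

The loop checks the interrupt flag between candidates and also exits when the interrupt arrives during the sleep, so the thread stops promptly in either case.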
Load on Startup
To have the PrimeSearcher start searching for primes as quickly as possible, we can configure the servlet's web application to load the servlet at server start. This is accomplished by adding the <load-on-startup> tag to the <servlet> entry of the deployment descriptor, as shown in Example 3-8.
Example 3-8. Loading a Servlet on Startup
<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE web-app
    PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
    "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">

<web-app>
    <servlet>
        <servlet-name>
            ps
        </servlet-name>
        <servlet-class>
            PrimeSearcher
        </servlet-class>
        <load-on-startup/>
    </servlet>
</web-app>
This tells the server to create an instance of PrimeSearcher under the registered name ps and init( ) the servlet during the server's startup sequence. The servlet can then be accessed at the URL /servlet/ps. Note that the servlet instance handling the URL /servlet/PrimeSearcher is not loaded at startup.
The <load-on-startup> tag shown in Example 3-8 is empty. The tag can also contain a positive integer indicating the order in which the servlet should be loaded relative to other servlets in the context. Servlets with lower numbers are loaded before those with higher numbers. Servlets with negative values or noninteger values may be loaded at any time in the startup sequence, with the exact order depending on the server. For example, the web.xml shown in Example 3-9 guarantees first is loaded before second, while anytime could be loaded at any time during the server startup.
Example 3-9. A Little Servlet Parade
<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE web-app
    PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
    "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">

<web-app>
    <servlet>
        <servlet-name>
            first
        </servlet-name>
        <servlet-class>
            First
        </servlet-class>
        <load-on-startup>10</load-on-startup>
    </servlet>
    <servlet>
        <servlet-name>
            second
        </servlet-name>
        <servlet-class>
            Second
        </servlet-class>
        <load-on-startup>20</load-on-startup>
    </servlet>
    <servlet>
        <servlet-name>
            anytime
        </servlet-name>
        <servlet-class>
            Anytime
        </servlet-class>
        <load-on-startup/>
    </servlet>
</web-app>
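The ordering rule can be sketched in a few lines of Java. The class StartupOrder and its methods are hypothetical and only model the rule described above: servlets with a positive integer value are ordered by that value, and everything else is unordered. Placing the unordered servlets last is an arbitrary choice made for this sketch; a real server may load them at any point.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the load-on-startup ordering rule.
class StartupOrder {

  // Returns the value as a positive Integer, or null for empty,
  // negative, zero, or noninteger text.
  static Integer positiveInt(String text) {
    if (text == null) return null;
    try {
      int n = Integer.parseInt(text.trim());
      return n > 0 ? Integer.valueOf(n) : null;
    }
    catch (NumberFormatException e) {
      return null;
    }
  }

  // Maps servlet name to its <load-on-startup> text and returns one
  // possible load order consistent with the rule.
  static List<String> plan(Map<String, String> loadOnStartup) {
    final Map<String, Integer> rank = new LinkedHashMap<String, Integer>();
    List<String> unordered = new ArrayList<String>();
    for (Map.Entry<String, String> e : loadOnStartup.entrySet()) {
      Integer n = positiveInt(e.getValue());
      if (n == null) unordered.add(e.getKey());
      else rank.put(e.getKey(), n);
    }
    List<String> plan = new ArrayList<String>(rank.keySet());
    plan.sort((a, b) -> rank.get(a).compareTo(rank.get(b)));  // lower numbers first
    plan.addAll(unordered);  // arbitrary placement for this sketch
    return plan;
  }
}
```

Feeding it the three servlets of Example 3-9 yields first, then second, with anytime placed wherever the sketch happens to put it.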
Client-Side Caching
By now, we're sure you've learned that servlets handle GET requests with the doGet( ) method. And that's almost true. The full truth is that not every request really needs to invoke doGet( ) . For example, a web browser that repeatedly accesses PrimeSearcher should need to call doGet( ) only after the searcher thread has found a new prime. Until that time, any call to doGet( ) just generates the same page the user has already seen, a page probably stored in the browser's cache. What's really needed is a way for a servlet to report when its output has changed. That's where the getLastModified( ) method comes in.
Most web servers, when they return a document, include as part of their response a Last-Modified header. An example Last-Modified header value might be:
Tue, 06-May-98 15:41:02 GMT
This header tells the client the time the page was last changed. That information alone is only marginally interesting, but it proves useful when a browser reloads a page.
Most web browsers, when they reload a page, include in their request an If-Modified-Since header. Its structure is identical to the Last-Modified header:
Tue, 06-May-98 15:41:02 GMT
This header tells the server the Last-Modified time of the page when it was last downloaded by the browser. The server can read this header and determine if the file has changed since the given time. If the file has changed, the server must send the newer content. If the file hasn't changed, the server can reply with a simple, short response that tells the browser the page has not changed, and it is sufficient to redisplay the cached version of the document. For those familiar with the details of HTTP, this response is the 304 Not Modified status code.
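The server's decision can be sketched as a single comparison. The class ConditionalGet and its decide( ) method are hypothetical, both times are assumed to be milliseconds since the epoch, and -1 is assumed to mean "unknown" or "header absent". Because HTTP dates carry only whole seconds, the sketch rounds the last modified time down to the second before comparing.

```java
// Hypothetical sketch of the If-Modified-Since decision; JDK only.
class ConditionalGet {
  static final int SC_OK = 200;            // send the full, newer content
  static final int SC_NOT_MODIFIED = 304;  // tell the browser to use its cache

  // Both arguments are milliseconds since the epoch; -1 means unknown or absent.
  static int decide(long lastModified, long ifModifiedSince) {
    if (lastModified < 0 || ifModifiedSince < 0) {
      return SC_OK;                        // nothing to compare, play it safe
    }
    // HTTP dates have one-second resolution, so drop the milliseconds
    long lastModifiedSeconds = lastModified / 1000 * 1000;
    return ifModifiedSince < lastModifiedSeconds ? SC_OK : SC_NOT_MODIFIED;
  }

  public static void main(String[] args) {
    System.out.println(decide(6000, 5000));  // changed after the cached copy
    System.out.println(decide(5000, 5000));  // unchanged since the cached copy
  }
}
```

Without the rounding, a page modified at 5,999 milliseconds would look newer than a cached copy stamped 5,000 even though both fall in the same second as far as the header can express.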
This technique works great for static pages: the server can use the filesystem to find out when any file was last modified. For dynamically generated content, though, such as that returned by servlets, the server needs some extra help. By itself, the best the server can do is play it safe and assume the content changes with every access, effectively eliminating the usefulness of the
Server-Side Caching
The getLastModified( ) method can be used, with a little trickery, to help manage a server-side cache of the servlet's output. Servlets implementing this trick can have their output caught and cached on the server side, then automatically resent to clients as appropriate according to the servlet's getLastModified( ) method. This can greatly speed servlet page generation, especially for servlets whose output takes a significant time to produce but changes only rarely, such as servlets that display database results.
To implement this server-side caching behavior, a servlet must:
  • Extend com.oreilly.servlet.CacheHttpServlet instead of HttpServlet
  • Implement a getLastModified(HttpServletRequest) method as usual
Example 3-10 shows a servlet taking advantage of CacheHttpServlet . It's a guestbook servlet that displays user-submitted comments. The servlet stores the user comments in memory as a Vector of GuestbookEntry objects. We'll see a version of this servlet running off a database in Chapter 9. For now, to simulate reading from a slow database, the display loop has a half-second delay per entry. As the entry list gets longer, the rendering of the page gets slower. However, because the servlet extends CacheHttpServlet, the rendering only has to occur during the first GET request after a new comment is added. All later GET requests send the cached response. Sample output is shown in Figure 3-4.
Example 3-10. A Guestbook Using CacheHttpServlet
import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

import com.oreilly.servlet.CacheHttpServlet;

public class Guestbook extends CacheHttpServlet {

  private Vector entries = new Vector();  // User entry list
  private long lastModified = 0;          // Time last entry was added
  
  // Display the current entries, then ask for a new entry
  public void doGet(HttpServletRequest req, HttpServletResponse res) 
                               throws ServletException, IOException {
    res.setContentType("text/html");
    PrintWriter out = res.getWriter();

    printHeader(out);
    printForm(out);
    printMessages(out);
    printFooter(out);
  }

  // Add a new entry, then dispatch back to doGet()
  public void doPost(HttpServletRequest req, HttpServletResponse res) 
                                throws ServletException, IOException {
    handleForm(req, res);
    doGet(req, res);
  }

  private void printHeader(PrintWriter out) throws ServletException {
    out.println("<HTML><HEAD><TITLE>Guestbook</TITLE></HEAD>");
    out.println("<BODY>");
  }

  private void printForm(PrintWriter out) throws ServletException {
    out.println("<FORM METHOD=POST>");  // posts to itself
    out.println("<B>Please submit your feedback:</B><BR>");
    out.println("Your name: <INPUT TYPE=TEXT NAME=name><BR>");
    out.println("Your email: <INPUT TYPE=TEXT NAME=email><BR>");
    out.println("Comment: <INPUT TYPE=TEXT SIZE=50 NAME=comment><BR>");
    out.println("<INPUT TYPE=SUBMIT VALUE=\"Send Feedback\"><BR>");
    out.println("</FORM>");
    out.println("<HR>");
  }

  private void printMessages(PrintWriter out) throws ServletException {
    String name, email, comment;

    Enumeration e = entries.elements();
    while (e.hasMoreElements()) {
      GuestbookEntry entry = (GuestbookEntry) e.nextElement();
      name = entry.name;
      if (name == null) name = "Unknown user";
      email = entry.email;
      if (email == null) email = "Unknown email";
      comment = entry.comment;
      if (comment == null) comment = "No comment";
      out.println("<DL>");
      out.println("<DT><B>" + name + "</B> (" + email + ") says");
      out.println("<DD><PRE>" + comment + "</PRE>");
      out.println("</DL>");

      // Sleep for half a second to simulate a slow data source
      try { Thread.sleep(500); } catch (InterruptedException ignored) { }
    }
  }

  private void printFooter(PrintWriter out) throws ServletException {
    out.println("</BODY>");
  }

  private void handleForm(HttpServletRequest req,
                          HttpServletResponse res) {
    GuestbookEntry entry = new GuestbookEntry();

    entry.name = req.getParameter("name");
    entry.email = req.getParameter("email");
    entry.comment = req.getParameter("comment");

    entries.addElement(entry);

    // Make note we have a new last modified time
    lastModified = System.currentTimeMillis();
  }

  public long getLastModified(HttpServletRequest req) {
    return lastModified;
  }
}

class GuestbookEntry {
  public String name;
  public String email;
  public String comment;
}
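The caching idea behind the example can be illustrated with a small JDK-only sketch. This is not the com.oreilly.servlet.CacheHttpServlet implementation, which is not shown here; the class LastModifiedCache and its constructor arguments are hypothetical. It keeps the last rendered page and regenerates it only when the reported last modified time moves forward.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of a cache driven by a last modified time.
class LastModifiedCache {
  private final LongSupplier lastModified;  // plays the role of getLastModified()
  private final Supplier<String> renderer;  // plays the role of the slow doGet() rendering
  private String cachedPage = null;
  private long cachedStamp = -1;

  LastModifiedCache(LongSupplier lastModified, Supplier<String> renderer) {
    this.lastModified = lastModified;
    this.renderer = renderer;
  }

  // Returns the cached page, rendering again only if the content changed.
  synchronized String get() {
    long stamp = lastModified.getAsLong();
    if (cachedPage == null || stamp > cachedStamp) {
      cachedPage = renderer.get();          // the expensive step
      cachedStamp = stamp;
    }
    return cachedPage;
  }
}
```

In guestbook terms, the renderer runs once after each new comment and every later request receives the stored page.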
Chapter 4: Retrieving Information
To build a successful web application, you often need to know a lot about the environment in which it is running. You may need to find out about the server that is executing your servlets or the specifics of the client that is sending requests. And no matter what kind of environment the application is running in, you most certainly need information about the requests that the application is handling.
A number of methods provide servlets access to this information. For the most part, each method returns one specific result. Compared to the way environment variables are used to pass a CGI program its information, the servlet approach has several advantages:
  • Stronger type checking. Servlets get more help from the compiler in catching errors. A CGI program uses one function to retrieve its environment variables. Many errors cannot be found until they cause runtime problems. Let's look at how a CGI program and a servlet find the port on which its server is running.
    A CGI script written in Perl calls:
    $port = $ENV{'SERVER_PORT'};
    where $port is an untyped variable. A CGI program written in C calls:
    char *port = getenv("SERVER_PORT");
    where port is a pointer to a character string. The chance for accidental errors is high. The environment variable name could be misspelled (it happens often enough) or the datatype might not match what the environment variable returns.
    A servlet, on the other hand, calls:
    int port = req.getServerPort();
    This eliminates a lot of accidental errors because the compiler can guarantee there are no misspellings and each return type is as it should be.
  • Delayed calculation. When a server launches a CGI program, the value for each and every environment variable must be precalculated and passed, whether the CGI program uses it or not. A server launching a servlet has the option to improve performance by delaying these calculations and performing them on demand as needed.
  • More interaction with the server. Once a CGI program begins execution, it is untethered from its server. The only communication path available to the program is its standard output. A servlet, however, can work with the server. As discussed in the previous chapter, a servlet operates either within the server (when possible) or as a connected process outside the server (when necessary). Using this connectivity, a servlet can make ad hoc requests for calculated information that only the server can provide. For example, a servlet can have its server do arbitrary path translations, taking into consideration the server's aliases and virtual paths.
The Servlet
Each registered servlet name can have specific initialization (init) parameters associated with it. Init parameters are available to the servlet at any time; they are set in the web.xml deployment descriptor and generally used in init( ) to set initial or default values for a servlet or to customize the servlet's behavior in some way. Init parameters are more fully explained in Chapter 3.
A servlet uses the getInitParameter( ) method for access to its init parameters:
public String ServletConfig.getInitParameter(String name)
This method returns the value of the named init parameter or null if it does not exist. The return value is always a single String. It is up to the servlet to interpret the value.
The GenericServlet class implements the ServletConfig interface and thus provides direct access to the getInitParameter( ) method. This means the method can be called like this:
public void init() throws ServletException {
  String greeting = getInitParameter("greeting");
}
A servlet that needs to establish a connection to a database can use its init parameters to define the details of the connection. We can assume a custom establishConnection( ) method to abstract away the details of JDBC, as shown in Example 4-1.
Example 4-1. Using init Parameters to Establish a Database Connection
java.sql.Connection con = null;

public void init() throws ServletException {
  String host = getInitParameter("host");
  int port = Integer.parseInt(getInitParameter("port"));
  String db = getInitParameter("db");
  String user = getInitParameter("user");
  String password = getInitParameter("password");
  String proxy = getInitParameter("proxy");

  con = establishConnection(host, port, db, user, password, proxy);
}
There's also another more advanced and standard abstraction model available to servlets designed for Java 2, Enterprise Edition (J2EE). See Chapter 12.
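One detail of Example 4-1 deserves care: if the port parameter is missing, getInitParameter( ) returns null and Integer.parseInt( ) throws a NumberFormatException with a message that does not name the parameter. The following JDK-only sketch shows one way to validate such values and report clearer errors. The class InitParamReader is hypothetical, and a Map stands in for the servlet's init parameters; a real servlet would probably wrap the failure in a ServletException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical validation helper; a Map stands in for the init parameters.
class InitParamReader {
  private final Map<String, String> params;

  InitParamReader(Map<String, String> params) {
    this.params = params;
  }

  // Returns the trimmed value, or fails with a message naming the parameter.
  String required(String name) {
    String value = params.get(name);
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException("Missing init parameter: " + name);
    }
    return value.trim();
  }

  // Returns the value as a TCP port number between 1 and 65535.
  int requiredPort(String name) {
    String value = required(name);
    int port;
    try {
      port = Integer.parseInt(value);
    }
    catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          "Init parameter " + name + " is not a number: " + value);
    }
    if (port < 1 || port > 65535) {
      throw new IllegalArgumentException(
          "Init parameter " + name + " is out of range: " + port);
    }
    return port;
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<String, String>();
    params.put("host", "localhost");
    params.put("port", "8080");
    InitParamReader reader = new InitParamReader(params);
    System.out.println(reader.required("host") + ":" + reader.requiredPort("port"));
  }
}
```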
A servlet can examine all its init parameters using
The Server
A servlet can find out much about the server in which it is executing. It can learn the hostname, listening port, and server software, among other things. A servlet can display this information to a client, use it to customize its behavior based on a particular server package, or even use it to explicitly restrict the machines on which the servlet will run.
A servlet gains most of its access to server information through the ServletContext object in which it executes. Before API 2.2, the ServletContext was generally thought of as a reference to the server itself. Since API 2.2 the rules have changed and there now must be a different ServletContext for each web application on the server. The ServletContext has become a reference to the web application, not a reference to the server. For simple server queries, there's not much difference.
There are five methods that a servlet can use to learn about its server: two that are called using the ServletRequest object passed to the servlet and three that are called from the ServletContext object in which the servlet is executing.
A servlet can get the name of the server and the port number for a particular request with getServerName( ) and getServerPort( ), respectively:
public String ServletRequest.getServerName()
public int ServletRequest.getServerPort()
These methods are attributes of ServletRequest because the values can change for different requests if the server has more than one name (a technique called virtual hosting). The returned name might be something like www.servlets.com while the returned port might be something like 8080.
The getServerInfo( ) and getAttribute( ) methods of ServletContext provide information about the server software and its attributes:
public String ServletContext.getServerInfo()
public Object ServletContext.getAttribute(String name)
getServerInfo( ) returns the name and version of the server software, separated by a slash. The string returned might be something like
The Client
For each request, a servlet has the ability to find out about the client machine and, for pages requiring authentication, about the actual user. This information can be used for logging access data, associating information with individual users, or restricting access to certain clients.
A servlet can use getRemoteAddr( ) and getRemoteHost( ) to retrieve the IP address and hostname of the client machine, respectively:
public String ServletRequest.getRemoteAddr() 
public String ServletRequest.getRemoteHost()
Both values are returned as String objects. The information comes from the socket that connects the server to the client, so the remote address and hostname may be that of a proxy server. An example remote address might be 192.26.80.118 while an example remote host might be dist.engr.sgi.com.
The IP address or remote hostname can be converted to a java.net.InetAddress object using InetAddress.getByName( ) :
InetAddress remoteInetAddress = InetAddress.getByName(req.getRemoteAddr());
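InetAddress.getByName( ) throws a checked UnknownHostException, so the one-line conversion above needs a try block or a throws clause. The following sketch wraps the call in a hypothetical helper, RemoteAddressDemo.toInetAddress( ), and uses the sample address from the text in place of req.getRemoteAddr( ). When the argument is a literal IP address, getByName( ) only checks the format of the address and performs no name lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper around InetAddress.getByName(); JDK only.
class RemoteAddressDemo {

  // Converts an IP address or hostname string to an InetAddress,
  // turning the checked exception into an unchecked one.
  static InetAddress toInetAddress(String remoteAddr) {
    try {
      return InetAddress.getByName(remoteAddr);
    }
    catch (UnknownHostException e) {
      throw new IllegalArgumentException("Cannot resolve " + remoteAddr, e);
    }
  }

  public static void main(String[] args) {
    // In a servlet the argument would be req.getRemoteAddr()
    InetAddress address = toInetAddress("192.26.80.118");
    System.out.println(address.getHostAddress());
  }
}
```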
Due to the United States government's policy restricting the export of strong encryption, some web sites must be careful about who they let download certain software. Servlets, with their ability to find out about the client machine, are well suited to enforce this restriction. These servlets can check the client machine and provide links for download only if the client appears to be coming from a permitted country.
In the first edition of this book, permitted countries were only the United States and Canada, and this servlet was written to allow downloads only for users from these two countries. In the time since that edition, the United States government has loosened its policy on exporting strong encryption, and now most encryption software can be downloaded by anyone except those from the "Terrorist 7" countries of Cuba, Iran, Iraq, North Korea, Libya, Syria, and Sudan. Example 4-10 shows a servlet that permits downloads from anyone outside these seven countries.
Chapter 5: Sending HTML Information