In most three-tier web database systems, the majority of the application logic is in the middle tier. The client tier presents data to and collects data from the user; the database tier stores and retrieves the data. The middle tier serves most of the remaining roles that bring together the other tiers: it drives the structure and content of the data displayed to the user, and it processes input from the user as it is formed into queries on the database to read or write data. It also adds state management to the HTTP protocol. The middle-tier application logic integrates the Web with the database management system.
In the application framework used in this book, the components of the middle tier are a web server, a web scripting language, and the scripting language engine. A web server processes HTTP requests and formulates responses. In the case of web database applications, these requests are often for programs that interact with an underlying database management system. The web server we use throughout this book is the Apache Software Foundation’s Apache HTTP server, the open source web server used by more than 60% of Internet-connected web servers.
We use the PHP scripting language as our middle-tier scripting language. PHP is an open source project of the Apache Software Foundation and, not surprisingly, it is the most popular Apache HTTP server add-on module, with around 40% of the Apache HTTP servers having PHP capabilities. PHP is particularly suited to web database applications because of its integration tools for the Web and database environments. In particular, the flexibility of embedding scripts in HTML pages permits easy integration with the client tier. The database-tier integration support is also excellent, with more than 15 libraries available to interact with almost all popular database management systems.
Web servers are often referred to as HTTP servers. The term “HTTP server” is a good summary of their function: their basic task is to listen for HTTP requests on a network, receive HTTP requests made by user agents (usually web browsers), serve the requests, and return HTTP responses that contain the requested resources.
There are essentially two types of request made to a web server: the first asks for a file—often a static HTML web page or an image—to be returned, and the second asks for a program to be run and its output to be returned to the user agent. Simple requests for files are further discussed in Appendix B.
Requests for web scripts that access a database are examples of HTTP requests that require a server to run a program. With the software used in this book, the HTTP requests are for PHP script resources, which require that the PHP Zend engine be run, a script retrieved and processed, and the script output captured.
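The two kinds of request can be pictured with a small dispatcher. The following is only a minimal sketch in Python, not how Apache is implemented: the function name handle_request, the dictionaries STATIC_FILES and PROGRAMS, and the paths in them are all hypothetical, and a real server would read files from disk and hand script requests to the scripting engine.

```python
# Hypothetical resources: one static page and one "program" resource.
STATIC_FILES = {"/index.html": "<html><body>Welcome</body></html>"}
PROGRAMS = {"/report.php": lambda: "<html><body>generated page</body></html>"}


def handle_request(request_line, static_files, programs):
    """Return (status_line, body) for a single HTTP request line."""
    method, path, _version = request_line.split()
    if method != "GET":
        # This sketch only understands GET requests.
        return "HTTP/1.0 501 Not Implemented", ""
    if path in programs:
        # Second kind of request: run a program and return its output.
        return "HTTP/1.0 200 OK", programs[path]()
    if path in static_files:
        # First kind of request: return the stored file unchanged.
        return "HTTP/1.0 200 OK", static_files[path]
    return "HTTP/1.0 404 Not Found", ""


status, body = handle_request("GET /report.php HTTP/1.0", STATIC_FILES, PROGRAMS)
```

The point of the sketch is the branch: a file request returns stored content as is, while a program request returns whatever the program produces at the time of the request.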
The installation and configuration of Apache for most web database applications is straightforward. A concise installation guide for the Linux operating system is presented in Appendix A. Apache can be downloaded from http://www.apache.org; other Apache resources are listed in Appendix E.
Apache is fast and scalable. It can handle simultaneous requests from user agents and is designed to run under multitasking operating systems, such as Linux and 32-bit variants of Microsoft Windows. It’s also lightweight, has low per-process requirements, can effectively handle changes in request loads, and can run fast on even modest hardware.
Apache, at least conceptually, isn’t complicated. The web server is actually several processes, where one process coordinates the others. The coordinating process usually runs with the permissions of the superuser or root user on a Unix machine and doesn’t serve requests itself. The other processes, which usually run as less-privileged users for security, notify the coordinating server when they are available to handle requests. If too few servers are available to handle incoming requests, the coordinating server may start new servers; if too many are free, it may kill spare servers to save resources.
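The coordinating server's decision can be sketched as a simple rule. This is a simplified illustration in Python under our own assumptions, not Apache's actual algorithm: the function adjust_pool is hypothetical, its parameters are loosely named after the Apache directives MinSpareServers, MaxSpareServers, and MaxClients, and the default values are illustrative only.

```python
def adjust_pool(idle, total, min_spare=5, max_spare=10, max_clients=150):
    """Return how many servers to start (positive) or kill (negative).

    idle  -- servers currently waiting for a request
    total -- all servers currently running, busy or idle
    """
    if idle < min_spare:
        # Too few spare servers: start more, but never exceed max_clients.
        wanted = min_spare - idle
        return max(0, min(wanted, max_clients - total))
    if idle > max_spare:
        # Too many spare servers: kill the surplus to save resources.
        return -(idle - max_spare)
    # The number of spare servers is within bounds: do nothing.
    return 0


change = adjust_pool(idle=2, total=20)  # three more servers are needed
```

A real server applies such a rule repeatedly and may ramp up gradually rather than starting every missing server at once; the sketch shows only the direction of the adjustment.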
How Apache listens on the network and serves requests is controlled by its configuration file. The server administrator controls the behavior of Apache through more than 150 directives that affect resource requirements, response time, flexibility in dealing with request load variability, security, how HTTP requests are handled and logged, and most other aspects of its operation. Careful adjustment of these parameters is important for performance, and more details of Apache configuration can be found in the resources listed in Appendix E.
Version 1.3 of Apache has some limitations that will be addressed in Version 2.0. Version 2.0 is available for download, but at the time of writing remains in the beta-testing phase. Only around 20 sites are known to be using the beta version.
The significant enhancements in Apache 2.0 are:
Use of lighter-weight processes or threads in conjunction with the process model of the older versions. This will most likely offer a significant performance improvement in starting new servers and reduce the overall memory requirements of running servers.
Better support, performance, and stability on non-Unix machines.
Addition of filtering modules so that data can be modified as it is processed by the web server.
Support for IPv6, the new version of the IP protocol in the TCP/IP networking suite.
PHP has emerged as a component of many medium- and large-scale web database applications. This isn’t to say that other scripting languages don’t have excellent features. However, there are many reasons that make PHP a good choice, including:
PHP is open source, meaning it is entirely free. As such, community efforts to maintain and improve it are unconstrained by commercial imperatives.
One or more PHP scripts can be embedded into static HTML files, and this makes client-tier integration easy. On the downside, this can blend the scripts with the presentation; however, the template techniques described in Chapter 13 can solve most of these problems.
There are over 15 libraries for native, fast access to the database tier.
Fast execution of scripts. With the innovations in the Zend engine for script processing, execution is fast, and all components run within the main memory space of PHP (in contrast to other scripting frameworks, in which components are in distinct modules). Empirical evidence suggests that for tasks of at least moderate complexity, PHP is faster than other popular scripting tools.
Platform and operating-system flexibility. Apache runs on many different platforms and under selected operating systems; PHP runs on all these and more when integrated with other web servers.
PHP is suited to complex systems development. It is a fully featured programming language, with more than 50 function libraries.
The current version of PHP is Version 4—we call this PHP throughout most of this book—and the current release at the time of writing is PHP 4.0.6.
PHP4 represents a complete rewrite of the underlying scripting engine used in PHP3. The significant difference is a change in the model used to run scripts with the scripting engine. The PHP3 scripting engine was an interpreter. Each line of code in a script was read, parsed, and executed. If a statement in the body of a loop is executed 100 times, the line of code is reinterpreted 100 times using PHP3. This model is slow for complex scripts, but fast for short scripts.
The PHP4 script-processing model is different and designed for larger applications. A script is read, parsed, and compiled into an intermediate format, and then the intermediate code is executed by the PHP4 Zend engine script executor. This means that each line in the script is interpreted from its raw form only once, even if it is executed hundreds of times. Moreover, compilation allows optimization of code segments. The result is a performance improvement in PHP4 for all but very simple scripts.
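The difference between the two models can be illustrated with a short sketch. It is written in Python rather than PHP and is only an analogy: Python's built-in compile() stands in for the runtime compiler, eval() stands in for the executor, and the function names and the sample expression are our own assumptions.

```python
def run_reinterpreting(expression, values):
    """PHP3-style model: the source text is parsed again on every iteration."""
    parses = 0
    results = []
    for x in values:
        code = compile(expression, "<script>", "eval")  # parsed each time
        parses += 1
        results.append(eval(code, {"__builtins__": {}}, {"x": x}))
    return results, parses


def run_compiled(expression, values):
    """PHP4-style model: parse and compile once, then execute many times."""
    code = compile(expression, "<script>", "eval")  # parsed exactly once
    parses = 1
    results = [eval(code, {"__builtins__": {}}, {"x": x}) for x in values]
    return results, parses


slow_results, slow_parses = run_reinterpreting("x * 2 + 1", range(100))
fast_results, fast_parses = run_compiled("x * 2 + 1", range(100))
```

Both functions compute the same results for a loop body that runs 100 times, but the first parses the source 100 times and the second only once. For a script that executes each statement only once, the two approaches do the same amount of parsing, which is why the benefit appears mainly in longer or loop-heavy scripts.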
The architecture of the PHP4 scripting environment is shown in Figure 1-3 (image from Zend Technologies Inc.). As shown, PHP4 is a module of the web server software. The PHP software itself is divided into two components: the function libraries or modules, and the Zend engine.
When a user agent makes a request to the web server for a PHP script, six steps occur:
The web server passes the request to the Zend engine’s web server interface.
The web server interface calls the Zend engine and passes parameters to the engine.
The PHP script is retrieved from disk by the engine.
The script is compiled by the runtime compiler.
The compiled code is run by the engine’s executor and may include calls to function modules. The output of the executor is returned to the web server interface.
The web server interface returns output to the web server (which, in turn, returns the output as an HTTP response to the user agent).
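The six steps can be traced in a small simulation. This is a hedged sketch in Python, not the Zend engine's real interface: the names web_server, server_interface, engine_run, DISK, and FUNCTION_MODULES are all hypothetical, a dictionary stands in for the disk, and a one-line Python script stands in for a PHP script.

```python
import io
from contextlib import redirect_stdout

# Hypothetical stand-ins for the script on disk and the function modules.
DISK = {"/hello.script": "print('Hello, ' + upper(name))"}
FUNCTION_MODULES = {"upper": str.upper}


def engine_run(path, params):
    source = DISK[path]                    # step 3: retrieve the script
    code = compile(source, path, "exec")   # step 4: compile it
    output = io.StringIO()
    with redirect_stdout(output):          # step 5: execute, capture output
        exec(code, {**FUNCTION_MODULES, **params})
    return output.getvalue()


def server_interface(path, params):
    # Step 2: call the engine with the request parameters.
    # Step 6: hand the captured output back to the web server.
    return engine_run(path, params)


def web_server(path, params):
    # Step 1: pass the request to the engine's web server interface.
    body = server_interface(path, params)
    # The web server wraps the script output in an HTTP response.
    return "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n" + body


response = web_server("/hello.script", {"name": "world"})
```

The script calls upper(), which is supplied by the function modules rather than defined in the script itself, mirroring the executor's calls to function modules in the fifth step.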
How the PHP scripting engine is managed and run depends on how the PHP module is included in the Apache web server installation process. In the instructions provided in Appendix A, the PHP module library is statically linked with the httpd binary executable. This means that the PHP scripting engine is loaded into main memory when Apache runs, making the PHP engine run faster. The drawbacks are that Apache with a static PHP library consumes more memory than if the module is loaded dynamically, and that the module upgrade process is less flexible.
Pointers to web resources, books, and commercial products for PHP development are listed in Appendix E.
Note: the figure for PHP-capable Apache HTTP servers is from the Security Space web server survey, Apache module report, http://www.securityspace.com/s_survey/data/index.html (April 2001).