Chapter 22. Databases

The explosion of the Web was partly driven by relatively cheap and easy global Internet access to legacy databases. Most of this information is on mainframes or in relational database management systems (RDBMSs).

There are three standard classes of database access, each with different requirements:

  • The individual query of a read-only database, such as AltaVista.

  • The very complex query that looks for patterns in huge amounts of data, usually for marketing purposes. This is called data mining. In a famous example of data mining, grocery stores correlated sales of all items and found that beer and diapers were often bought together. No one had previously suspected this, but it made sense because both are items that you run out of and may make a special trip to buy. As a result of this discovery, grocery stores now tend to keep beer and diapers close together. Data mining is read-only, and the queries are usually so complex and take so long to run that public web access is not advisable.

  • Transaction processing, such as online credit card verification and sales, or bank account access. Transaction processing is rapidly becoming a key area of value for the Web.
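The beer-and-diapers correlation described in the list above can be illustrated with a small co-occurrence count. This is only a minimal sketch: the basket data and the pair_counts function are hypothetical, and real data mining runs far more elaborate queries over far larger data sets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each set is the contents of one shopping basket.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "diapers", "milk"},
    {"bread", "chips"},
]

def pair_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair one canonical order, e.g. ("beer", "diapers").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Even this toy version compares every pair of items in every basket, which hints at why such queries become expensive on real sales data and are poorly suited to public web access.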

These three classes of database access differ in how much they need to scale and in how easily they can be scaled. The simple read-only class is easily scaled by replicating the database. Data mining databases generally do not need to be scaled, because so few users will be making queries. Transaction processing databases are ...
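Scaling read-only access by replication, as mentioned above, can be sketched as a pool of identical database copies with queries spread across them in turn. This is an illustrative sketch only: the ReplicaPool class, the make_replica helper, and the pages table are hypothetical names, and in-memory SQLite databases stand in for what would normally be separate replica servers.

```python
import itertools
import sqlite3

def make_replica(rows):
    """Build one in-memory copy of a (hypothetical) read-only pages table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO pages VALUES (?, ?)", rows)
    conn.commit()
    return conn

class ReplicaPool:
    """Spread read-only queries across identical replicas, round-robin."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._next = itertools.cycle(range(len(self.replicas)))
        self.hits = [0] * len(self.replicas)  # queries served per replica

    def query(self, sql, params=()):
        i = next(self._next)
        self.hits[i] += 1
        return self.replicas[i].execute(sql, params).fetchall()

rows = [("/a", "Page A"), ("/b", "Page B")]
pool = ReplicaPool(make_replica(rows) for _ in range(3))
results = [
    pool.query("SELECT title FROM pages WHERE url = ?", ("/a",))
    for _ in range(6)
]
print(pool.hits)
```

Because the data never changes, every replica returns the same answer, so adding capacity is just a matter of adding copies. That shortcut is not available once writes have to be kept consistent across copies.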
