Web Performance Daybook Volume 2

by Stoyan Stefanov
June 2012
Content level: Intermediate to advanced
226 pages
5h 5m
English
O'Reilly Media, Inc.

Chapter 15. Using Intelligent Caching to Avoid the Bot Performance Tax

Matthew Prince

In 2004, Lee Holloway (https://twitter.com/icqheretic) and I started Project Honey Pot (http://www.projecthoneypot.org/). The site, which tracks online fraud and abuse, primarily consists of web pages that report the reputation of IP addresses. We had limited resources and tried to get the most out of them, yet when I just checked Google, it listed more than 31 million pages in its index for the www.projecthoneypot.org (http://www.projecthoneypot.org/) site.

Project Honey Pot’s pages are relatively simple and asset-light, but like many sites today they include significant dynamic content that is updated regularly and at unpredictable intervals. To deliver near-real-time updates, the pages need to be database-driven.

To maximize the site's performance, we used a number of different caching layers from the beginning to store the most frequently accessed pages. Lee, whose background is in high-performance database design, studied reports from services like Google Analytics to understand how visitors moved through the site, and built caching to keep regularly accessed pages from needing to hit the database.
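As a rough illustration of the idea of keeping regularly accessed pages away from the database, here is a minimal sketch of one such caching layer. It is not Project Honey Pot's actual code: the `PageCache` class, the `render_from_database` stub, and the size and time-to-live values are all hypothetical, chosen only to show the pattern.

```python
import time
from collections import OrderedDict
from typing import Callable


class PageCache:
    """Small LRU cache with a time-to-live, placed in front of a slow renderer.

    Hypothetical sketch: names and defaults are assumptions, not taken
    from the original site.
    """

    def __init__(self, render: Callable[[str], str], max_entries: int = 1000,
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._render = render            # expensive, database-backed function
        self._max_entries = max_entries  # upper bound on cached pages
        self._ttl = ttl_seconds          # how long a cached page stays fresh
        self._clock = clock              # injectable so the cache can be tested
        self._entries = OrderedDict()    # key -> (stored_at, html), oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            # Fresh copy available: serve it without touching the database.
            self._entries.move_to_end(key)
            self.hits += 1
            return entry[1]
        # Missing or stale: render the page and remember the result.
        self.misses += 1
        html = self._render(key)
        self._entries[key] = (now, html)
        self._entries.move_to_end(key)
        # Evict the least recently used pages once the cache is full.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
        return html


def render_from_database(ip_address: str) -> str:
    """Hypothetical stand-in for a database-driven page render."""
    return "<html>Reputation report for " + ip_address + "</html>"


cache = PageCache(render_from_database, max_entries=10_000, ttl_seconds=30.0)
page = cache.get("192.0.2.1")        # first request renders the page
page_again = cache.get("192.0.2.1")  # repeat request is served from the cache
```

A cache like this helps only when requests concentrate on a small set of popular pages, which is the access pattern that analytics reports of human visitors tend to show.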

We thought we were pretty smart, but in spite of following the best practices of web application performance design, the site would grind to a halt with alarming frequency. The culprit turned out to be something unexpected and hidden from the view of many people optimizing web performance: automated bots. ...


Publisher Resources

ISBN: 9781449337667