Drupal has come a long way since becoming an open source project in 2001. What was once a fairly limited content management system has become a very powerful framework that runs millions of websites. Everything from personal blogs and small neighborhood businesses to Internet startups, universities, governments, and global companies is running Drupal. There are hundreds of Drupal-focused companies offering development, hosting, and performance tuning services, and new Drupal sites, small and large, are coming online every day.
The three of us authors all work at Tag1 Consulting, where we focus specifically on the performance and scalability of Drupal websites. If there is one question we see asked more than any other, it’s, “Does Drupal scale?” The question may be asked in many different forms: “I want to do X (insert super dynamic, cool feature here), and it needs to support millions of users”; “We’re thinking of using Drupal for this project, but we hear that using Views is terribly slow”; or focusing on the infrastructure components, “We’re confident in Drupal, but pretty sure that MySQL can’t keep up with our traffic.” In the end, it all boils down to, “Can Drupal scale?” because when we say “Drupal” in this context, we actually mean the entire stack of infrastructure and software that supports a Drupal site. The short answer is, of course, “Yes,” but if it were that simple, this book could start and end with this introduction. As you might expect, the actual answer of how to achieve performance while scaling up a large Drupal site is much more complicated.
We deal with clients of all types and with many varying needs. We repeatedly see many of the same issues arise: pages aren’t caching properly, servers are overloaded, database queries are running too slowly. All of these issues contribute to the overall question of whether and how Drupal can scale. While it would be impossible to cover all the possible reasons for any potential problems in a single book, the best practices and guidance provided here will cover the most common problems encountered while scaling Drupal websites. We provide a strong base of knowledge that can be used to plan for and overcome more difficult or unique performance issues.
The primary goal of this book is to help you solve Drupal performance and scalability issues. Drupal makes creating websites incredibly easy; however, if you aren’t careful, it can quickly turn into a performance nightmare. This book is full of information on best practices for running a high-performance Drupal site. This is not just limited to “enable these performance settings in the Drupal configuration”; rather, we take a holistic approach to website performance, covering Drupal internals, coding, and infrastructure techniques that all come together to build a high-performing and scalable website.
This is a technical book providing in-depth explanations and examples of common methods used to improve Drupal site performance. It is expected that readers of this book have a basic understanding of Drupal and the LAMP stack and are familiar with common hardware and infrastructure concepts. We’ve designed this book to be useful to both developers and system administrators. A site cannot perform at a high level unless attention is given to both code and infrastructure.
The main focus of the book will be on Drupal versions 7 and 8, with Drupal 8 planned for release shortly after this book goes to press. There are still many websites running Drupal 6, and while our infrastructure advice and examples are still very relevant for older versions of Drupal, be aware that the code examples and discussion of Drupal internals have generally changed for the newer versions of Drupal.
We cover a wide range of topics within this book and have grouped them into the following topical sections.
Chapter 1, Beginning a Performance Project, discusses the various aspects of a website that all contribute to the big picture of website performance. Here, we also introduce a process for analyzing websites and approaching performance improvement projects.
This section covers a wide variety of Drupal application performance issues, starting with Chapter 2, Frontend Performance, where we describe best practices for frontend optimization, looking at network utilization, code optimization, and issues specific to mobile performance.
We go into more depth on code-level optimizations in Chapter 4, Drupal Coding for Optimal Performance. This chapter covers important issues that should be addressed when writing or extending custom code in Drupal, giving best practices for items such as entities, the cache API, and the use of queues and workers. On the flip side, Chapter 5, Drupal Coding for Abysmal Performance, talks about common pitfalls that should be avoided, and explains why certain code can greatly reduce website performance.
Chapter 6, Verifying Changes, outlines the importance of tracking performance metrics for a site and using that information to understand how changes to the site affect performance for better or worse.
We begin the section on infrastructure issues with Chapter 7, Infrastructure Design and Planning, which describes best practices for designing an infrastructure to host a Drupal website and related services. Early planning of infrastructure design will help a website to easily scale as it grows.
Chapter 8, Service Monitoring, covers how to monitor services and infrastructure in order to be alerted to potential issues before they affect a website and how to track performance and usage baselines in order to better understand how services react under load.
Chapter 9, “DevOps”: Breaking Down Barriers Between Development and Operations, introduces many common infrastructure ideas and best practices to break down barriers between development and operations. This chapter discusses revision control systems, system configuration management, deployment workflow, and development virtual machines.
Chapter 10, File Storage for Multiple Web Servers, analyzes the difficulties faced with sharing a single Drupal files/ directory between multiple web servers and gives examples of common file sharing options including NFS, rsync, and GlusterFS.
Chapter 11, Drupal and Cloud Deployments, introduces the idea of virtualized hosting and cloud-based infrastructures. Here we discuss the performance and scalability benefits of using a virtualized infrastructure, as well as some of the trade-offs between using virtual servers as opposed to physical servers.
Chapter 12, Failover Configuration, explains how to provide highly available services, using technologies such as Heartbeat to handle failover when a service goes offline.
Chapters 13, 14, and 15 all cover MySQL database information related to Drupal. Chapter 13, MySQL, provides an in-depth look at MySQL performance considerations and general configuration settings. It also contains an introduction to MySQL storage engines, with specific focus on InnoDB for performance and scalability. Chapter 14, Tools for Managing and Monitoring MySQL, introduces a number of tools commonly used for tuning, managing, and monitoring MySQL servers. Chapter 15, MySQL Query Optimization, wraps up the MySQL discussion by focusing on methods for locating and optimizing slow queries.
Chapter 16, Alternative Storage and Cache Backends, describes how alternative database and data storage engines can be used with Drupal to improve performance. This chapter includes examples of how to implement Memcache, Redis, and MongoDB backends with Drupal.
Chapter 17, Solr Search, discusses using Solr as an alternative search option for Drupal. We look at some of the benefits and added functionality that can be achieved by shifting the search backend out of MySQL.
For an optimally performing site, it’s important to have a properly tuned web server. Chapter 18, PHP and httpd Configuration, discusses how to best configure the web server and PHP for a Drupal website. httpd.conf is nothing to be scared of—we cover thread settings, keepalive, logging, and other useful configuration options for Apache httpd. This chapter also discusses PHP configurations and the importance of using an opcode cache.
Chapter 19, Reverse Proxies and Content Delivery Networks, introduces the concept of using a reverse proxy to cache website content. We give detailed examples of how to use Varnish with Drupal, including specific Varnish Configuration Language (VCL) configurations that can dramatically increase website performance. This chapter also covers content delivery networks (CDNs) and explains options for integrating Drupal with a CDN.
One important lesson in this book is that website performance is not a one-time task; it’s something that needs to be done continually in order to have a website perform at its best and be able to scale to meet increasing traffic needs. Chapter 20, Load Testing, discusses load testing tools and the importance of ongoing testing in order to catch performance issues before they become major problems.
Wrapping up the book, Chapter 21, Where to Next?, provides some external resources to expand upon ideas presented in the book.
The following typographical conventions are used in this book:
Constant width bold
Constant width italic
This icon signifies a tip, suggestion, or general note.
This icon indicates a warning or caution.
Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/tag1consulting/high-performance-drupal.
This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “High Performance Drupal by Jeff Sheltren, Narayan Newton, and Nathaniel Catchpole (O’Reilly). Copyright 2014 Tag1 Consulting, 978-1-449-39261-1.”
If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at firstname.lastname@example.org.
Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.
Safari Books Online offers a range of product mixes and pricing programs for organizations, government agencies, and individuals. Subscribers have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and dozens more. For more information about Safari Books Online, please visit us online.
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://oreil.ly/HP-Drupal.
To comment or ask technical questions about this book, send email to email@example.com.
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
This book has been quite an undertaking for all of us, and we couldn’t have done it without the help and support of many people. First of all, thanks to all of the Drupal contributors who have made Drupal into the amazing platform it is today. Specifically, we would like to thank our wonderful technical editors for their thoughtful reviews and ideas: Fabian Franz, Rudy Grigar, and Mark Sonnabaum.
We’d also like to give a special thanks to Jeremy Andrews for his endless hours spent reviewing the book, for providing ideas for concepts to cover, and for constantly pushing us to provide better explanations for things we may take for granted. His encouragement and ongoing assistance with all aspects of the book were absolutely priceless. The book would not be anywhere near as good as it is without him.
We also need to thank Tag1 Consulting, our employer, for providing us the flexibility to work on the book over such a long period of time. And thanks as well to Meghan Blanchette, our O’Reilly editor, for pushing for us to write this book, and for putting up with our seemingly endless delays.
First and foremost I need to thank my wife, Sara, for being so supportive and encouraging throughout this process, and also for her understanding throughout all of the late nights and weekends I spent cooped up in the office writing. Thanks also to all my family and friends for your support and excitement about the book, in spite of the fact that it does not involve a zombie apocalypse.
This book was a true collaborative effort, and I really appreciate the hard work done by Narayan and Nat, who both brought their amazing expertise and insight. I can’t even imagine how Nat was able to write so much content for the book even as he was in the midst of the Drupal 8 release as the branch maintainer.
Firstly I need to thank Jeff, who was the major motivation for getting this book done and the driving force to keep it moving forward. Secondly, I must thank my very tolerant wife, Candice, who somehow didn’t get too upset at the concept of us doing just one more thing. Lastly, we all very much thank Jeremy, Peta, and all of our coworkers at Tag1 Consulting for creating the time for us to work on this.
Massive thanks go to my wife Shoko and daughter Amile for putting up with yet another Drupal project, Jeff for keeping the book on track, Tag1 Consulting for interesting consulting projects that allow me to spend more time on these issues than is probably healthy, and all of the Drupal core and contributed module contributors for working on the software that both runs into these issues and also attempts to solve them.