Chapter 9. “DevOps”: Breaking Down Barriers Between Development and Operations

Historically, there has been a strict separation between development and infrastructure. In recent years, the “DevOps” methodology has become very popular, stressing the integration of development and infrastructure operations in order to improve the code deployment process. There are many aspects to DevOps, including close collaboration between development and infrastructure teams, easing the deployment process with automation, and standardizing development and QA environments. This methodology becomes even more important as infrastructure demands increase with new technologies, as development teams move toward rapid-release models (agile/iterative development), and when dealing with distributed teams.

While you may not feel that you need to fully embrace DevOps for your environment, many ideas that stem from DevOps culture can still prove beneficial. For example, practices commonly used in DevOps environments that help simplify deployments and reduce the chance of regressions when deploying to production include:

  • Tracking changes to both code and infrastructure, with the ability to roll back should something not work.
  • Maintaining separate (but nearly identical) environments for development, testing, and staging of code and infrastructure changes.
  • Using revision control systems and continuous integration systems for code deployment.

This chapter shouldn’t be considered a set of instructions on “how to do DevOps”; instead, it will focus on some of the underlying ideas and technologies that we hit on throughout the rest of the book.

Revision Control Systems

Revision control systems have a bit of an upfront learning curve, but once you get over that, you will never go back to not using one. These days, the argument is generally not over whether to use a revision control system for your code, but which one to use. We’ll stay out of the “which revision control system is best” holy war here and just stick with Git for our examples, since Git is used for Drupal.org and all projects hosted there. If you’re not already sold on a particular revision control system, here are some features to consider when selecting one:

Distributed system versus central system
Distributed systems have become much more popular of late, and with good reason—every copy of the repository stores all files and history locally, which not only speeds up many commands, but allows developers to work offline.
Branching model
Depending on your development workflow and team size, having a revision control system with a powerful branching model can be very useful (even mandatory). It makes collaborating on changes and testing them much easier.
Performance
Is the system you’ve chosen fast enough for your common tasks?

That said, it’s not especially important which revision control system you use, just that you use one at all! While the trend has been toward distributed systems recently, and those do provide some definite benefits, what’s important is that you choose something that fits in with your workflow and the technical skill level of your team.

Locally Hosted or External Service

One of the first decisions you’ll need to make when implementing a revision control system is whether to host it on a local server or to use an external service (potentially a paid service). There are many services, such as Bitbucket and GitHub, that provide hosting for code repositories. Some people prefer to use these services, not only for the ease of use (and setup) they offer, but also for the additional features, such as user management and, in the case of GitHub, the easy forking and pull request model, which makes it simple for developers to review and discuss proposed changes. That said, any modern revision control system can be set up locally with just a few commands, so if you don’t want or need the additional features offered by the hosted services, it’s cheap and relatively easy to host yourself.

Not Just for Code

Generally, your actual website code will be the first thing you think about when implementing a revision control system. However, it’s also very useful for other things, such as keeping your system configuration management scripts in a code repository. By storing configuration management or other scripts in a revision control system, you get an automatic log of system changes, the ability to easily collaborate with other developers and/or system administrators, and an easy way to roll back changes.

Configuration Management Systems

Like revision control systems, configuration management systems are something that people are generally reluctant to use at first, but once they get comfortable with them, there is no turning back (in a good way). Configuration management systems allow you to write code (in various languages, depending on the underlying system) to define how a system is configured. This has many benefits; for example, it provides a sort of “live documentation” for your servers, and it means you won’t have 50 manually copied configuration files on a server (httpd.conf.bak, httpd.conf.old, httpd.conf.not.working… don’t pretend you’ve never seen something like that before!).

An example of what you can do with a configuration management system is to store your custom PHP and Apache configuration files on a web server, and ensure that Apache is running and configured to start on boot. While this may seem like a very simple example, think of what happens when you have multiple servers and you suddenly need to make a configuration change or bring up a new service. What if there are some special commands you use when deploying something manually? It is much better to keep those commands in a configuration management system so that they are documented and not forgotten. Likewise, what happens if one of your servers crashes and needs to be rebuilt? With a configuration management system that becomes a simple task, and you can ensure that everything will be configured as it was before.
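
As a rough sketch of that example, a minimal Puppet manifest managing Apache might look something like the following (the class name, package/service name, and file paths are illustrative and will vary by distribution and Puppet version):

class webserver {
  # Make sure the Apache package is installed.
  package { 'httpd':
    ensure => installed,
  }

  # Manage our custom Apache configuration from the module's files/ directory.
  file { '/etc/httpd/conf/httpd.conf':
    ensure  => file,
    source  => 'puppet:///modules/webserver/httpd.conf',
    require => Package['httpd'],
    notify  => Service['httpd'],
  }

  # Ensure Apache is running and set to start on boot; the notify above
  # restarts it whenever the configuration file changes.
  service { 'httpd':
    ensure => running,
    enable => true,
  }
}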

Which System to Use

There are many popular configuration management systems. The most widely used are CFEngine, Chef, and Puppet. A relative newcomer is Ansible, which aims to keep configuration management as simple as possible. While these all have the common goal of allowing you to write configuration to define a system, they go about it in slightly different ways, using different languages for their configuration and different underlying programming languages. All of these systems have active communities and development and are fairly well supported by most Linux distributions. We suggest trying them each out and selecting the system that you are most comfortable with.

Pulling It Together: In-Depth Example with Puppet and Git

We mentioned the usefulness of keeping your configuration management scripts in revision control, but what does that look like exactly? For this example, we’ll use Puppet and Git, though it is just as applicable to other configuration management and revision control systems. There are a few pieces involved:

  1. A master repository to which changes will be pushed.
  2. A mechanism to update the scripts on the master server (puppetmasterd) when changes are pushed to the master repository. This could be a post-receive Git hook, or something external such as a Jenkins job.
  3. Local clones of the repository where each developer/administrator can do work and then push it back to the master repository for review and/or implementation.

For this example, let’s assume we have a utility server that will serve as both the Puppet master server and the host of the master Git repository.

First, we’ll set up a Git repository. This could be at any path you choose, but we’ll go with something under /srv:

# mkdir -p /srv/git/infrastructure.git
# cd /srv/git/infrastructure.git
# git --bare init

The git init command creates an empty Git repository for us. In this case, the --bare flag is used in order to skip creating a working directory; since this central repository will only be pushed to and cloned from, never edited directly, that’s exactly what we want.

That’s all that’s needed to start an empty Git repository, but we’ll also run a couple of commands to configure the git repository for a shared environment (multiple people contributing to the same repository). Here we assume that you’ve set up an “infrastructure” Linux group—anyone with membership in that group will have read and write access to the repository:

# git config core.sharedrepository 1
# git config receive.denyNonFastforwards true
# chgrp -R infrastructure .
# find . -type d -exec chmod 2770 {} \;

The find/chmod command will set the sgid bit on directories in order to retain group ownership on new files created there—this will help keep permissions correct as people push to the repository.

The Git repo can now be cloned by any user with SSH access to that machine who belongs to the infrastructure group:

$ git clone util.example.com:/srv/git/infrastructure.git

Now that you have a local clone of the repository, you can add some files and push them back to the central repository. For this example we’ll assume we have a Puppet directory tree something like that in Figure 9-1.

Figure 9-1. Puppet directory structure (a repository containing top-level manifests/ and modules/ directories)
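
A typical layout for such a repository might look something like this (the module names are illustrative):

infrastructure/
    manifests/
        site.pp
    modules/
        apache/
            files/
            manifests/
        mysql/
            files/
            manifests/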

When setting up the Puppet master server, we will use a Git clone of this repository to populate the Puppet manifests/ and modules/ directories. In this case, we’ll configure Puppet to look in /srv/puppet for those files:

# mkdir /srv/puppet
# git clone /srv/git/infrastructure.git /srv/puppet

You’ll need to edit your puppet.conf to point to that directory.
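
For a 3.x-era Puppet master, the relevant puppet.conf settings might look roughly like the following; adjust the section and setting names for your Puppet version (newer releases favor directory environments instead):

[master]
    manifest   = /srv/puppet/manifests/site.pp
    modulepath = /srv/puppet/modules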

Next, for the automation bit. You don’t want to have to log in to the server, change directories, and do a git pull every time you push changes to the configuration management scripts! Instead, we’ll use a hook in order to update the Puppet master scripts each time new changes are pushed to the repository.

Note

Most revision control systems have the idea of hooks. A hook in this case is a script that gets run before or after a specified action (e.g., before a code push is accepted, or after a code push). Each revision control system has slightly different hooks and names, but the basic idea is the same across them all.

In this case we’ll use the git post-receive hook, which is run after new code is pushed into the repository. To implement this, we need to create the post-receive script in the hooks/ directory inside the git repository (on the server). The file we’ll put there needs to be named post-receive and be an executable script. We’ll use the following script:

#!/bin/sh
# Git runs hooks with GIT_DIR set to the bare repository; unset it so the
# pull below operates on the clone in /srv/puppet rather than on this repo.
unset GIT_DIR
cd /srv/puppet || exit 1
/bin/echo "`/usr/bin/whoami` is updating puppet master config @ `/bin/pwd`"
/usr/bin/git pull

Simply name that file post-receive, copy it into /srv/git/infrastructure.git/hooks/, and ensure that it has executable permissions. Of course, the destination directory needs to have permissions set such that anyone pushing to the git repository can update files there. Following the same permissions as used for the git repository in /srv/git would work well.

To test the new hook, commit some changes to your local Git clone and then push them back to the central repository. You should see the script output informing you that the directory is being updated.
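
A test push might look something like this (the commit message and username are just examples); the line prefixed with remote: is the output of the post-receive hook relayed back to your terminal:

$ git commit -am "Update Apache configuration"
$ git push origin master
...
remote: jdoe is updating puppet master config @ /srv/puppet
...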

Development Virtual Machines

It can be difficult for developers if they are working in a development environment (think a local laptop) that is set up completely differently from the staging and production environments. There are various ways to work around this—some people choose to set up a development server with Apache virtual hosts for each developer and let them develop on that server directly. However, this has the downside that developers aren’t able to test certain (infrastructure) changes without affecting other developers. What if it were possible to give each developer a local (virtual) environment that closely matched production, and gave them the ability to test any code and/or infrastructure changes locally? There are a number of ways to do this, but if you already have a configuration management system in place, a tool called Vagrant provides a very easy solution for creating just such a virtual machine (VM) environment.

Note

Vagrant provides an easy way to create development VMs. It ships with support for VirtualBox, but also has a plug-in system that allows you to use Vagrant with other virtualization providers, such as VMware or AWS. More documentation is available at the official project website.

Typical Vagrant usage is to start with a bare-bones “base box,” which is a virtual machine image with minimal packages and services installed. Vagrant provides support for multiple provisioners, which are the scripts run against the base box in order to configure it to meet your needs. In its simplest form, provisioning could be a set of shell scripts; however, Vagrant really shines when used with a configuration management system (currently support is provided for Ansible, Chef, and Puppet). If you are already using a configuration management system with your production infrastructure, it is very easy to integrate it into Vagrant in order to create an easily reproducible development environment that closely matches the production server configuration.

Distributing a small base box image and then doing all configuration with a configuration management system provides a few benefits:

  • Initial download of the VM image is faster, since it isn’t very large.
  • Although the initial provisioning step may take a while (and transfer many packages from the Internet), future changes to the VM can be made by updating the configuration management scripts instead of having developers download a full VM image simply to make a few small changes.
  • Infrastructure and configuration management changes can be easily tested on a local Vagrant VM in order to give some assurance that things will work similarly in other (test, staging, production) environments.

How to Distribute Development VMs with Vagrant

Generally, you will want to start with a small base box image, and it should match the operating system you are using on your production infrastructure. You can either create your own (instructions are provided in the Vagrant documentation), or use one of the many base boxes that are publicly available on sites such as http://vagrantbox.es. One important thing to look out for is that you use a base box that includes support for whichever configuration management provider you will use. This simply means that, for example, Puppet is installed on the VM image if you are going to be using Puppet for provisioning.
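
In practice, getting a base box onto a developer machine takes only a few commands (the box name and URL here are placeholders):

$ vagrant box add centos-base http://example.com/boxes/centos-base.box
$ vagrant init centos-base
$ vagrant up

From there, vagrant ssh logs in to the running VM, and vagrant provision re-runs the provisioning scripts after the configuration management code changes.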

Once you’ve settled on a base box, you can start integrating your existing configuration management system. Most things will just work if you’ve done a good job of writing your configuration management scripts; however, since the Vagrant image starts out mostly unconfigured, it’s very important that your dependency ordering is set correctly for everything so that all services will work correctly after one run of the provisioning scripts. Depending on how you are doing code and database deployments in your production environment, you may need to create additional scripts for deploying the site code onto the VM—for example, a “sitedeploy” Puppet module that gets a copy of the site code from your revision control system, imports initial data into the database, and ensures that the Drupal database user is granted the correct permissions.

Now, distributing the system to all developers and admins becomes a matter of distributing a copy of your Vagrantfile (Vagrant configuration directives) and the configuration management scripts. The Vagrantfile can automate the downloading of a standard base box image. Generally we keep all the configuration in a revision control system so that it’s easy to make updates and everyone can pick them up with a git pull or similar.
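
As a sketch, a Vagrantfile using the Puppet provisioner might look roughly like the following (the box name, URL, and paths are placeholders, and the exact options depend on your Vagrant version):

Vagrant.configure("2") do |config|
  config.vm.box     = "centos-base"
  config.vm.box_url = "http://example.com/boxes/centos-base.box"

  # Provision the VM with the same Puppet manifests and modules used
  # for the rest of the infrastructure.
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "manifests"
    puppet.manifest_file  = "site.pp"
    puppet.module_path    = "modules"
  end
end

Checking the Vagrantfile and the Puppet code into the same infrastructure repository means a developer can bring up a matching VM with nothing more than a git clone followed by vagrant up.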

Deployment Workflow

Now that your developers can quickly provision local virtual machines that closely match the production configuration, it’s time to take advantage of them to test changes to Drupal. We’ve mentioned the importance of a revision control system for managing your site’s code, but it doesn’t stop there. Once you have a revision control system in place, you will also need a well-defined workflow for your developers, as well as for code deployments to individual environments. A well-defined workflow will improve site stability (you are testing code first and not editing directly on your production site) and should allow developers to easily collaborate on changes for upcoming releases, while still giving you the ability to make quick “hotfixes” when a bug is found that requires immediate attention. There are a number of widely used and accepted workflows, and generally there is no wrong way as long as you find something that works for you and everyone on your team agrees to stick to it.

Example Workflow with Git

As we mentioned, there are almost endless options for how to approach your development and deployment workflow. We’ll give an example consisting of three environments: development, staging, and production. This is a standard setup that we strongly recommend. The Git branching model described here is based on a workflow initially written about by Vincent Driessen and referred to as “Git Flow,” which is also the name of the set of optional Git plug-ins used to easily work in the model. For the sake of simplicity, we won’t use so-called release branches; however, some people will find those very useful and should read Vincent’s full article for more information.

There are a couple of things to consider here: the Git branching workflow, and the deployment workflow. The deployment workflow is easy to understand from a high level—we want new code to start in the development environment, then be pushed to staging for testing before finally being pushed to production when it’s deemed ready. Figure 9-2 illustrates this workflow.

Figure 9-2. Code deployment workflow (code flows from development to staging to production)

We’ll go into the code deployment process in more depth in the next section. For now, it’s only important to understand that code changes flow in only one direction, except in the case of hotfixes, and that code is never edited directly on the servers, but is instead always pushed through the revision control system (although some sites make an exception to this rule when working in the development environment, with the caveat that developers clean up after themselves so that the automated deployment tools continue to work as expected).

There are a number of rules and guidelines that shape the development workflow and Git branching model:

  • Production is run from the master branch (specifically, from a Git tag pointing to a commit on the master branch).
  • Staging is run from tags created from the develop branch.
  • Development is run from the HEAD of the develop branch.
  • Developers create branches off of the develop branch to do their work, and those are merged back into develop once they’ve been reviewed.
  • If a hotfix is ever needed, a hotfix branch is created off of master and then merged into develop for testing (and inclusion in the next full release). After testing, it is merged back into master for deployment to production.

This gives us three distinct environments, with active development happening in the development environment, testing and QA happening in the staging environment, and code being pushed to the production environment once QA is complete in the staging environment.

To see exactly how this all works together, we can look at a specific developer branch as it makes its way through the workflow and eventually is deployed to the production environment:

  1. The developer creates a branch (we’ll call the branch “feature-go-faster”) off of the current develop branch.
  2. Code is committed to the developer’s branch, feature-go-faster, and the branch is pushed to the central repository for review.
  3. The feature-go-faster branch is reviewed, and once it passes review it is merged into the main develop branch.
  4. Code on the development web server is updated with the latest code in the develop branch. This may also involve syncing the current database and files from staging or production back into the development environment.
  5. Basic (or extensive, if that’s your thing) testing is performed in the development environment. At some schedule—which could be weekly, biweekly, monthly, or just “when there are enough changes to warrant a new release”—a release tag is created pointing to the current state of the develop branch.
  6. The new release tag is deployed to the staging environment. In many cases, this also involves syncing the current database and files from the production environment into the staging environment.
  7. Testing is performed in the staging environment.
  8. Once testing is complete, the code from that tag is merged into the master branch and a new release tag is generated.
  9. The new release tag is deployed to the production environment and any final QA testing is performed there.
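
In Git terms, the path just described might look roughly like the following (the branch, tag, and remote names are illustrative):

$ git checkout develop && git pull      # start from the latest develop
$ git checkout -b feature-go-faster     # step 1: create the feature branch
  ...edit, git add, git commit...
$ git push origin feature-go-faster     # step 2: push the branch for review
$ git checkout develop                  # step 3: after review, merge into develop
$ git merge --no-ff feature-go-faster
$ git tag 7.x-1.3-rc1                   # step 5: tag develop for a staging release
$ git checkout master                   # step 8: after staging QA, merge and tag
$ git merge --no-ff develop
$ git tag 7.x-1.3
$ git push origin develop master --tags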

This design can be considered a starting point that can be adapted to fit your specific needs. Smaller sites may choose to combine the development and staging environments since they have fewer developers and changes, or potentially fewer testing requirements. Other sites may add additional testing or maintenance environments.

Deployment with Jenkins CI

As demonstrated in the preceding workflow example, there are a few interactions between the various environments that need to happen during deployment. This includes the actual code deployment (usually some sort of git pull or git checkout command); syncing files and the database back from production or staging; and potentially some additional tasks such as running database updates, clearing the cache, taking a database snapshot, or even rebuilding Solr search indexes. There are many ways that these tasks can be run, but one popular option is to use Jenkins, which can connect over SSH (or with Jenkins “slaves”) to the various server environments and run shell scripts, drush commands, etc. on each server as needed. Using a continuous integration (CI) server such as Jenkins provides a number of benefits, such as job health and history tracking, job scheduling (time- or event-based), a fine-grained permissions system, and a web user interface for those who aren’t comfortable running scripts from the command line.

Note

Other popular deployment options include Capistrano or, if using Chef, the built-in Chef deploy resource.

It’s advisable to run Jenkins from an internal utility server if you have one. It’s important that access to Jenkins is limited as much as possible because once it’s configured, it will have access to make changes to all of your site environments, including the production site. Access can be restricted with the Jenkins users/permissions system; in addition, access can be controlled with a firewall, or Jenkins can be configured to listen only on the local interface, requiring users to use SSH port forwarding in order to connect to the Jenkins web interface.

Note

SSH port forwarding is a useful trick for connecting to Jenkins, as well as other services that may be protected behind a firewall. Forwarding ports over SSH can be done with the -L flag. For example, if Jenkins is listening on port 8080 only on the local interface on the server, you could use ssh -L 8080:localhost:8080 servername.com and then access Jenkins by pointing your browser to http://localhost:8080.

In a simple setup, the jenkins user on the server running the Jenkins service can be given an SSH key pair. Then, on each server that Jenkins needs to access (generally this would be limited to only your web servers, but it depends on exactly what you are configuring Jenkins to do), a local user can be created granting login access with the jenkins user’s public key. In this manner, jobs can be configured to use drush (with a properly configured drush aliases file) or to call SSH directly:

ssh deploy@webhost /usr/local/bin/deploy_script.sh

The deploy user on the web servers can be given any username you like. In some cases, the account may need access to run scripts as another user—for example, running a cache clear as the apache (or www-data) user so that it has access to remove files from the Drupal files/ directory. Those commands should be configured in sudo as needed. The following is an example file that can be placed in /etc/sudoers.d/deploy:

Defaults:deploy !requiretty
Defaults:deploy !env_reset
deploy ALL=(apache) NOPASSWD: /usr/local/bin/site_cache_clear.sh

Note that the requiretty option must be disabled for the user that Jenkins is connecting as, since it will not be running from a valid terminal.
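
To make this concrete, here is a rough sketch of what a script like the /usr/local/bin/deploy_script.sh referenced above might contain (the docroot, tag argument, and drush usage are assumptions for illustration):

#!/bin/sh
# Hypothetical deploy script run over SSH by a Jenkins job as the deploy user.
set -e

TAG="$1"                # release tag to deploy, passed in by the Jenkins job
DOCROOT=/var/www/html   # Drupal docroot; adjust for your layout

cd "$DOCROOT"
git fetch --tags origin              # fetch the new release tag
git checkout "$TAG"                  # check out the tagged release
drush -r "$DOCROOT" updatedb -y      # run any pending database updates
sudo -u apache /usr/local/bin/site_cache_clear.sh   # clear caches, per the sudoers entry above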

There are a number of different types of scripts that are typically run from Jenkins:

Code deployment
These are scripts that connect to a revision control system to update/deploy code onto the web servers.
Deployment helpers
These scripts are for handling tasks related to a deployment—for example, taking a database snapshot before an update, putting the site into maintenance mode, performing a cache clear, performing database updates, etc.
Environment synchronization
These scripts are for syncing the Drupal files/ directory and the database between environments. While code is deployed from development to staging to production, database and files/ sync happens in reverse: the production database and files/ are synchronized to staging, and then from there to development (a sketch of such a script follows this list).
Site management
These are periodic scripts that support the site—for example, running Drupal cron, daily database backups, etc.
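
As an example of an environment synchronization script, a minimal production-to-staging sync might look something like this (the drush alias names, hostname, and paths are placeholders, and it assumes drush aliases are configured for both environments):

#!/bin/sh
# Hypothetical sync script: copy the production database and files/ to staging.
set -e

# Sync the database from production to staging.
drush sql-sync @production @staging -y

# Sync the Drupal files/ directory (run from the staging web server).
rsync -az --delete prodweb.example.com:/var/www/html/sites/default/files/ \
    /var/www/html/sites/default/files/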
