Introduction

This book will teach you how to install and use Puppet for managing computer systems. It will introduce you to how Puppet works, and how it provides value. To better understand Puppet and learn the basics, you’ll set up a testing environment, which will evolve as your Puppet knowledge grows. You’ll learn how to declare and evaluate configuration policy for hundreds of nodes.

This book covers modern best practices for Puppet. You’ll find tips throughout the book labeled Best Practice.

You’ll learn how to update Puppet 2 or Puppet 3 code to take advantage of the expanded features and improved parser of Puppet 4. You’ll learn how to deploy the new Puppet Server. You’ll have a clear strategy for upgrading older servers to Puppet 4 standards. You’ll learn how to run Puppet services over the IPv6 protocol.

Most important of all, this book will cover how to scale your Puppet installation to handle thousands of nodes. You’ll learn multiple strategies for handling diverse and heterogeneous environments, and the reasons why each of these approaches may or may not be appropriate for your needs.

What Is Puppet?

Puppet manages configuration data, including users, packages, processes, and services—in other words, any resource of the node you can define. Puppet can manage complex and distributed components to ensure service consistency and availability. In short, Puppet brings computer systems into compliance with a configuration policy.

Puppet can ensure configuration consistency across servers, clients, your router, and that computer on your hip. Puppet utilizes a flexible hierarchy of data sources combined with node-specific data to tune the policy appropriately for each node. Puppet can help you accomplish a variety of tasks. For example, you can use it to do any of the following:

  • Deploy new systems with consistent configuration and data installed.
  • Upgrade security and software packages across the enterprise.
  • Roll out new features and capabilities to existing systems painlessly.
  • Adjust system configurations to make new data sources available.
  • Decrease the cost and effort involved in minor changes on hundreds of systems.
  • Simplify the effort and personnel involved in software deployments.
  • Automate buildup and teardown of replica systems to test proposed changes.
  • Repurpose existing computer resources for a new use within minutes.
  • Gather a rich data set of information about the infrastructure and computing resources.
  • Provide a clear and reviewable change control mechanism.

Twenty years ago, people were impressed that I was responsible for 100 servers. At a job site last year, I was responsible for over 17,000 servers. At my current job I’d have to go check somewhere to find out, as we scale up and down dynamically based on load. These days, my Puppet code spins up more servers while I’m passed out asleep than I did in the first 10 years of my career. You can achieve this only by fully embracing what Puppet provides.

I was recently reminded of something I quipped to a CEO almost six years ago:

You can use Puppet to do more faster if you have ten nodes.
You must use Puppet if you have ten hundred nodes.

Puppet enables you to make a lot of changes both quickly and consistently. You don’t have to write out every step; you only have to define how it should be. You are not required to write out the process for evaluating and adjusting each platform. Instead, you utilize the Puppet configuration language to declare the final state of the computing resources. Thus, we describe Puppet as declarative.

Why Declarative

When analyzing hand-built automation systems, you’ll invariably find commands such as the following:

$ echo "param: newvalue" >> configuration-file

This command appends a new parameter and value to a configuration file. This works properly the first time you run it. However, if the same operation is run again, the file has the value twice. This isn’t a desirable effect in configuration management. To avoid this, you’d have to write code that checks the file for the configuration parameter and its current value, and then makes any necessary changes.
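A minimal sketch of that check-before-change logic in shell (the file and parameter names are illustrative, carried over from the example above):

```shell
# Idempotent variant of the append: add the parameter only if it is
# not already present, so repeated runs leave the file unchanged.
grep -q '^param:' configuration-file 2>/dev/null \
  || echo "param: newvalue" >> configuration-file
```

Even this small guard clause illustrates the burden: every imperative change needs its own hand-written check, and the checks multiply as the configuration grows.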

Language that describes the actions to perform is called imperative. It defines what to do, and how to do it. It must spell out every step required to reach the desired configuration, and it must handle any differences between platforms or operating systems.

When managing computer systems, you want the operations applied to be idempotent: applying an operation once or many times produces the same result. This allows you to apply and reapply (or converge) the configuration policy and always achieve the desired state.

To achieve a configuration state no matter the existing conditions, the specification must avoid describing the actions required to reach the desired state. Instead, the specification should describe the desired state itself, and leave the evaluation and resolution up to the interpreter.

Language that declares the final state is called declarative. Declarative language is much easier to read, and less prone to breakage due to environment differences. Puppet was designed to achieve consistent and repeatable results. Every time Puppet evaluates the state of the node, it will bring the node to a state consistent with the specification.
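As a sketch of the declarative style, the append operation shown earlier could instead be expressed as a desired state. This example assumes the file_line resource type from the puppetlabs-stdlib module; the path and values are illustrative:

```puppet
# Declare that the line must exist; Puppet determines whether
# any change is needed, and repeated runs are safely idempotent.
file_line { 'param setting':
  path => '/etc/app.conf',
  line => 'param: newvalue',
}
```

Note what is absent: there is no check, no branch, and no append command. The interpreter owns the evaluation and resolution.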

How Puppet Works

Any node you control contains an application named puppet agent. The agent evaluates and applies Puppet manifests, or files containing Puppet configuration language that declares the desired state of the node. The agent evaluates the state of each component described in a manifest, and determines whether or not any change is necessary. If the component needs to be changed, the agent makes the requested changes and logs the event.

If Puppet is configured to utilize a centralized Puppet server, Puppet will send the node’s data to the server, and receive back a catalog containing only the node’s specific policy to enforce.

Now you might be thinking to yourself, “What if I want a change applied to only a subset of nodes?” Puppet provides many different ways to classify and categorize nodes to limit which resources should be applied to which nodes. You can use node facts such as hostname, operating system, node type, Puppet version, and many others. Best of all, you can easily create new criteria custom to your environment.
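One common way to limit a resource to a subset of nodes is to branch on node facts within the manifest. A minimal sketch using the Puppet 4 $facts hash (the package name is illustrative):

```puppet
# Apply this resource only on Debian-family nodes;
# all other nodes simply skip it during catalog evaluation.
if $facts['os']['family'] == 'Debian' {
  package { 'apt-transport-https':
    ensure => installed,
  }
}
```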

The Puppet agent evaluates the state of only one node. In this model, you can have agents on tens, hundreds, or thousands of nodes evaluating their catalogs and applying changes on their nodes at exactly the same time. The localized state machine ensures a scalable and fast parallel execution environment.

Why Use Puppet

As we have discussed, Puppet provides a well-designed infrastructure for managing the state of many nodes simultaneously. Here are a few reasons to use it:

  • Puppet utilizes a node’s local data to customize the policy for each specific node: hundreds of values specific to the node—including hostname, operating system, memory, networking configuration, and many node-specific details—are used to tune the policy appropriately.
  • Puppet agents can handle OS-specific differences, allowing you to write a single manifest that will be applied by OS-specific providers on each node.
  • Puppet agents can be invoked with specific tags, allowing a filtered run that only performs operations that match those tags during a given invocation.
  • Puppet uses a decentralized approach where each node evaluates and converges its own Puppet catalog separately. No node is waiting for another node to complete.
  • Puppet agents report back success, failure, and convergence results for each resource, and the entire run.
  • Orchestration systems such as the Marionette Collective (MCollective) can invoke and control the Puppet agent for instantaneous large-scale changes.
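The provider abstraction in the first two bullets can be sketched as a single resource declaration that every agent realizes with its platform’s native tooling (the package name is illustrative):

```puppet
# One declaration; each node's agent selects the appropriate
# provider: yum on Red Hat, apt on Debian, pkg on FreeBSD, etc.
package { 'openssh-server':
  ensure => installed,
}
```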

In Part I, you will learn how to write simple declarative language that will make changes only when necessary.

In Part II, you will create a module that uses Puppet to install and configure the Puppet agent. This kind of recursion is not only possible but common.

In Part III, you will learn how to use Puppet masters and Puppet Server to offload and centralize catalog building, report processing, and backup of changed files.

In Part IV, you will use MCollective to orchestrate immediate changes with widespread Puppet agents.

Puppet provides a flexible framework for policy enforcement that can be customized for any environment. After reading this book and using Puppet for a while, you’ll be able to tune your environment to exactly your needs. Puppet’s declarative language not only allows but encourages creativity.

Is Puppet DevOps?

While Puppet is a tool used by many DevOps teams, using Puppet will not by itself give you all the benefits of adopting DevOps practices within your team.

In practice, Puppet is used by many classic operations teams who handle all change through traditional planning and approval processes. It provides them with many benefits, especially a readable, somewhat self-documenting description of what is deployed on any system. This provides tremendous value to operations where change management control and auditing are primary factors.

On the other hand, the ability to manage rapid change across many systems that Puppet provides has also cracked open the door to DevOps for many operations teams. Classic operations teams were often inflexible because they were responsible for the previously difficult task of tracking and managing change. Puppet makes it possible to not only track and manage change, but to implement locally customized change quickly and seamlessly across thousands of nodes. This makes it possible for operations teams to embrace practices that are flexible for changing business needs.

You don’t have to be working with developers to utilize DevOps practices. This is a common misconception of DevOps. The developer in DevOps is not a different person or team, it is you! Many teams that utilize DevOps practices don’t support developers at all; they manage systems that serve a completely different industry. You are participating in DevOps when you utilize Agile development processes to develop code that implements operations and infrastructure designs.

Perhaps the biggest source of confusion comes when people try to compare using Puppet (a tool) to implement DevOps practices, and the idea of “valuing individuals and interactions over processes and tools.” It is easiest to explain this by first outlining the reasons that operations teams were historically perceived as inflexible. The tools available for managing software within the enterprise used to be shockingly limited. Many times a very small change, or a customization for one group, would require throwing away the software management tool and embracing another. That’s a tremendous amount of work for an operations team.

When using Puppet, if an individual can make a case for the value of a change (interaction), then rolling out a local customization usually involves only a small refactoring of the code. Applying changes becomes easy, and thus avoids a conflict between valuable requests and the limitations of the tools available. Discussion of the merits of the change has higher value than processes used to protect operations teams from unmanageable change.

No tool or set of tools, product, or job title will give an operations team all the benefits of utilizing DevOps practices. I have seen far too many teams with all of the keywords, all of the titles, and none of the philosophy. It’s not a product, it’s an evolving practice. You don’t get the benefits without changing how you think.

I highly recommend making the effort to fully understand both Agile processes and DevOps practices and methodologies. Don’t skimp. Someone on your team who is good at promoting and guiding others should be fully trained on Agile processes. Get the people responsible for creating change out to conferences where they can learn from others’ experiences, such as DevOps Days, PuppetConf, and O’Reilly’s Velocity conference.

That’s not an obligatory push of the publisher’s conference. Many people consider John Allspaw’s “10+ Deploys Per Day: Dev and Ops Cooperation,” which was presented at Velocity 2009, to be a founding cornerstone of the DevOps movement. Velocity was the first large conference to add DevOps items to its agenda, and DevOps practices are a major focus of the conference today.

Puppet is a great tool, and you’ll find that it’s even more valuable when used with DevOps practices. While this is not a book about DevOps, it will present many useful tools and strategies for practicing DevOps in your organization.

Time to Get Started

As we proceed, this book will show you how Puppet can help you do more, and do it faster and more consistently than ever before. You’ll learn how to extend Puppet to meet your specific needs:

  • You’ll install Puppet and get it working seamlessly to control files, packages, services, and the Puppet daemon.
  • You’ll discover an active community of Puppet developers who develop modules and other Puppet plugins on the Puppet Forge and GitHub.
  • You’ll build your own custom fact. You’ll use this fact within your Puppet manifest to handle something unique to your environment.
  • You’ll build your own custom Puppet module. You’ll learn how to test the module safely prior to deploying in your production environment.
  • You’ll learn how to package your module and upload it to the Puppet Forge.
  • You’ll learn how to configure Puppet Server, allowing you to centralize Puppet services within a campus or around the globe.
  • You’ll tour the ecosystem of components that utilize, extend, and enhance Puppet within your environment.

By the time you finish this book, you will understand not just how powerful Puppet is, but also exactly how it works. You’ll have the knowledge and understanding to debug problems within any part of the infrastructure. You’ll know what to tune as your deployment grows. You’ll have a resource to use for further testing as your knowledge and experience expands.

It’s time to get declarative.
