What is Puppet?

An introduction to configuration management and one of its most popular tools.

By Jo Rhett

April 25, 2016

Shadow play (source: Wikimedia Commons)

This book will teach you how to install and use Puppet for managing computer systems. It will introduce you to how Puppet works, and how it provides value. To better understand Puppet and learn the basics, you’ll set up a testing environment, which will evolve as your Puppet knowledge grows. You’ll learn how to declare and evaluate configuration policy for hundreds of nodes.

This book covers modern best practices for Puppet. You’ll find tips throughout the book labeled Best Practice.

Learn faster. Dig deeper. See farther.

Join the O'Reilly online learning platform. Get a free trial today and find answers on the fly, or master something new and useful.

Learn more

You’ll learn how to update Puppet 2 or Puppet 3 puppet code for the increased features and improved parser of Puppet 4. You’ll learn how to deploy the new Puppet Server. You’ll have a clear strategy for upgrading older servers to Puppet 4 standards. You’ll learn how to run Puppet services over IPv6 protocol.

Most important of all, this book will cover how to scale your Puppet installation to handle thousands of nodes. You’ll learn multiple strategies for handling diverse and heterogenous environments, and reasons why each of these approaches may or may not be appropriate for your needs.

Puppet manages configuration data, including users, packages, processes, and services—in other words, any resource of the node you can define. Puppet can manage complex and distributed components to ensure service consistency and availability. In short, Puppet brings computer systems into compliance with a configuration policy.

Puppet can ensure configuration consistency across servers, clients, your router, and that computer on your hip. Puppet utilizes a flexible hierarchy of data sources combined with node-specific data to tune the policy appropriately for each node. Puppet can help you accomplish a variety of tasks. For example, you can use it to do any of the following:

Deploy new systems with consistent configuration and data installed.
Upgrade security and software packages across the enterprise.
Roll out new features and capabilities to existing systems painlessly.
Adjust system configurations to make new data sources available.
Decrease the cost and effort involved in minor changes on hundreds of systems.
Simplify the effort and personnel involved in software deployments.
Automate buildup and teardown of replica systems to test proposed changes.
Repurpose existing computer resources for a new use within minutes.
Gather a rich data set of information about the infrastructure and computing resources.
Provide a clear and reviewable change control mechanism.

Twenty years ago, people were impressed that I was responsible for 100 servers. At a job site last year, I was responsible for over 17,000 servers. At my current job I’d have to go check somewhere to find out, as we scale up and down dynamically based on load. These days, my Puppet code spins up more servers while I’m passed out asleep than I did in the first 10 years of my career. You can achieve this only by fully embracing what Puppet provides.

I was recently reminded of something I quipped to a CEO almost six years ago:

You can use Puppet to do more faster if you have ten nodes.
You must use Puppet if you have ten hundred nodes.

Puppet enables you to make a lot of changes both quickly and consistently. You don’t have to write out every step, you only have to define how it should be. You are not required to write out the process for evaluating and adjusting each platform. Instead, you utilize the Puppet configuration language to declare the final state of the computing resources. Thus, we describe Puppet as declarative.

Why Declarative

When analyzing hand-built automation systems, you’ll invariably find commands such as the following:

$ echo "param: newvalue" >> configuration-file

This command appends a new parameter and value to a configuration file. This works properly the first time you run it. However, if the same operation is run again, the file has the value twice. This isn’t a desirable effect in configuration management. To avoid this, you’d have to write code that checks the file for the configuration parameter and its current value, and then makes any necessary changes.

Language that describes the actions to perform is called imperative. It defines what to do, and how to do it. It must define every change that should be followed to achieve the desired configuration. It must also deal with any differences in each platform or operating system.

When managing computer systems, you want the operations applied to be idempotent, where the operation achieves the same results every time it executes. This allows you to apply and reapply (or converge) the configuration policy and always achieve the desired state.

To achieve a configuration state no matter the existing conditions, the specification must avoid describing the actions required to reach the desired state. Instead, the specification should describe the desired state itself, and leave the evaluation and resolution up to the interpreter.

Language that declares the final state is called declarative. Declarative language is much easier to read, and less prone to breakage due to environment differences. Puppet was designed to achieve consistent and repeatable results. Every time Puppet evaluates the state of the node, it will bring the node to a state consistent with the specification.

How Puppet Works

Any node you control contains an application named puppet agent. The agent evaluates and applies Puppet manifests, or files containing Puppet configuration language that declares the desired state of the node. The agent evaluates the state of each component described in a manifest, and determines whether or not any change is necessary. If the component needs to be changed, the agent makes the requested changes and logs the event.

If Puppet is configured to utilize a centralized Puppet server, Puppet will send the node’s data to the server, and receive back a catalog containing only the node’s specific policy to enforce.

Now you might be thinking to yourself, “What if I only want the command executed on a subset of nodes?” Puppet provides many different ways to classify and categorize nodes to limit which resources should be applied to which nodes. You can use node facts such as hostname, operating system, node type, Puppet version, and many others. Best of all, new criteria custom to your environment can be easily created.

The Puppet agent evaluates the state of only one node. In this model, you can have agents on tens, hundreds, or thousands of nodes evaluating their catalogs and applying changes on their nodes at exactly the same time. The localized state machine ensures a scalable and fast parallel execution environment.

Post topics: Operations