Chapter 1. Understanding Docker
If you’re reading this book, you have probably heard of Docker. It is the most popular container platform available today—and the one that is doing the most to reshape the way organizations handle IT workloads. Indeed, if you do anything related to IT, it’s hard not to have heard of Docker by now.
But do you understand why Docker has become so popular so quickly? Do you know which types of challenges Docker can solve for an enterprise (by which I mean large-scale organizations with complex computing needs)? Are you familiar with the different sets of tasks—from running a container, to managing a container cluster, to monitoring containerized applications—that are necessary to deploy Docker in production within an enterprise environment?
This chapter provides an introduction to these topics by explaining the basics of how Docker works, briefly discussing the history of Docker, and analyzing where Docker fits within the broader ecosystem of container technologies. Subsequent chapters delve deeper into Docker’s functionality, compare Docker to virtual machines, explain how to plan a Docker migration, and discuss other types of container and microservices technologies that can help enterprises achieve a new level of efficiency in their IT operations.
How Docker Containers Work
Let’s start with an overview of Docker containers.
The first thing to know about Docker is that it is an application-container platform. An application container has these characteristics:
- It is a software-defined environment. The container can be created and run entirely through software processes; it does not require any special hardware.
- The application that runs inside the container must be designed to run on the operating system that hosts the container. For example, an application running inside a container on a Linux server needs to be Linux-compatible.
- It is abstracted from the host system via controls that limit the ability of processes inside the container to interact with processes on the host. This isolation provides some security benefits, while also simplifying administration.
- In most cases, the container is also subject to limits on the amount of computing, storage, and networking resources that the containerized application can access.
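To illustrate the last point, Docker lets you cap a container's resources when you launch it. The following is a minimal sketch; the image name and limit values are arbitrary examples, not recommendations:

```shell
# Run a container with explicit resource limits (values are illustrative).
# --memory caps RAM, --cpus caps CPU share, and --rm removes the
# container automatically when it exits.
docker run --rm \
  --memory=256m \
  --cpus=0.5 \
  nginx:alpine
```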
These characteristics make application containers similar to virtual machines in some ways. However, as Chapter 2 explains in detail, application containers differ from virtual machines in two key ways. First, application containers serve as host environments for individual applications, not entire operating systems. Second, application containers work by executing processes that are managed by the operating system of the container host. Even though those processes run inside a special, software-defined environment, they are still managed by the host server. In contrast, virtual machines run processes inside a virtual hardware server, with the host machine playing no direct role in managing those processes. For more about the differences between Docker containers and virtual machines, refer to Chapter 2.
It’s also worth noting that Docker is not the only platform for creating application containers, application containers are not the only type of container, and Docker can technically be used to create other types of containers beyond application containers. We’ll get to these points later in this book—specifically in Chapter 5, which discusses where Docker containers fit within the larger container landscape.
Creating, Running, Managing, and Monitoring Containers
The Docker software platform consists of tools for creating, running, managing, and monitoring containers. Here is what each of these tasks entails:
- Container creation
To launch a Docker container, you first need to create a container image. Each container image is built from a Dockerfile, the blueprint that tells Docker how to assemble your application and its dependencies into an image. The docker build command generates a container image based on a Dockerfile. In large-scale deployments, Docker container images are generally hosted in a container registry, which serves as a repository from which users can download the images.
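A minimal sketch of the image-creation workflow follows. The application, image name, and registry tag are hypothetical examples:

```shell
# Write a minimal Dockerfile for a hypothetical Python application
# (shown as a heredoc purely for illustration).
cat > Dockerfile <<'EOF'
FROM python:3-slim
COPY app.py /app/app.py
CMD ["python", "/app/app.py"]
EOF

# Build an image from the Dockerfile in the current directory and tag it.
docker build -t myorg/myapp:1.0 .

# Push the image to a registry so other hosts can pull it.
docker push myorg/myapp:1.0
```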
- Running a container
To run a Docker container, you typically use the docker pull command to download the image that contains the application you want to run inside a container. You then create and start a container from that image with docker run. Once the container is up, you can execute commands inside it using docker exec, and you can restart a stopped container with docker start.
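The basic run workflow can be sketched as follows, using a placeholder image and container name:

```shell
# Download the image ("myorg/myapp" is a placeholder name).
docker pull myorg/myapp:1.0

# Create and start a container from the image, detached in the background.
docker run -d --name myapp myorg/myapp:1.0

# Execute a command inside the running container.
docker exec myapp ps aux

# Stop the container, then restart it later from its stopped state.
docker stop myapp
docker start myapp
```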
- Container management
Docker provides a basic framework for managing containers. You can start, stop, and pause them, and perform other basic management tasks. However, Docker itself is not designed as a full-fledged container management solution, and it is not practical to start and stop many containers manually in a production environment. To manage a large number of containers in production, you would use a container orchestrator, such as Swarm (which ships as part of the Docker package but can be switched off and replaced with an alternative, non-Docker orchestrator) or Kubernetes. An orchestrator automates most of the work required to start and stop containers, and to ensure availability across a cluster of container hosts.
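The contrast between single-container management and orchestration can be sketched as follows; the container and service names are illustrative:

```shell
# Basic lifecycle management of a single container.
docker pause myapp      # freeze the container's processes
docker unpause myapp    # resume them
docker stop myapp       # stop the container

# At scale, an orchestrator takes over. For example, Swarm mode
# is enabled on a host and then asked to keep three replicas running:
docker swarm init
docker service create --replicas 3 --name web nginx:alpine
```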
- Container monitoring
Docker also provides some basic container monitoring features through commands such as docker stats, which displays information about container resource usage, and docker top, which lists the processes running inside a container. Here again, however, Docker itself is not intended to be a complete monitoring solution. To monitor containers in production, organizations will be best served by a third-party container-ready monitoring solution. Datadog, Splunk, and (through the CloudWatch monitoring service) Amazon Web Services (AWS) are examples of vendors currently offering container monitoring solutions.
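The built-in monitoring commands look like this in practice ("myapp" is a placeholder container name):

```shell
# Print a single snapshot of resource usage (CPU, memory, network,
# block I/O) for all running containers; omit --no-stream for a live view.
docker stats --no-stream

# List the processes running inside a specific container.
docker top myapp
```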
These are the fundamentals of how Docker works. In the next chapter, we’ll look more extensively at what makes Docker’s functionality useful within an enterprise.
A Brief History of Docker
But first, let’s examine where Docker came from, and how the Docker platform has evolved since it first became publicly available in the spring of 2013.
From dotCloud to Docker, Inc.
The software that became Docker was born as a project at a French company called dotCloud. The company’s main offering was a Platform-as-a-Service (PaaS) that provided hosting for Web apps and databases for clients. To help build the dotCloud hosting infrastructure, the company’s engineers leveraged isolation features baked into the Linux kernel, such as cgroups, along with LXC, a userspace toolset for working with those kernel features. (For a longer history of these technologies, see Chapter 3.)
These features, the building blocks for what became Docker containers, had existed for years before dotCloud engineers started working with them. While dotCloud did not invent them, its engineers did introduce a major innovation by creating a standard API, which simplified the process of building a container using LXC.
At first, dotCloud maintained Docker as an internal project. But in March 2013, shortly after demonstrating Docker to an enthusiastic crowd at PyCon, dotCloud released Docker as an open source project.1 That meant anyone could download, run, and modify the code.
From there, the Docker project quickly eclipsed dotCloud’s PaaS business. DotCloud changed its name to Docker, Inc. in October 2013. The company secured tens of millions of dollars of funding over four rounds between 2013 and 2015, and became valuable enough that Microsoft reportedly offered as much as $4 billion to acquire it in 2016, without success.
The Evolution of Docker Code
That’s the history of Docker the company. But it’s important to understand that Docker, Inc. is not the same thing as the open source project that develops the Docker platform.
A detailed history of the Docker platform would require delving deep into the history of LXC. But the short of it is that, again, the Docker software project was born as an easier way to interact with LXC. About a year after the Docker code was made open source, the project migrated away from LXC by adopting a different framework, called libcontainer, as the basis for creating and managing Docker containers.
Since the code went public, the Docker technology has also matured in ways that make Docker much more enterprise-friendly. The biggest changes include the following:
- Docker developers introduced data volumes to help provide persistent storage for applications running inside containers.
- Docker container networking has grown more robust and sophisticated than the basic single network connection that was available when dotCloud first demoed Docker in the spring of 2013.
- Docker created Swarm, an orchestration tool for managing many containers running as part of a cluster. Swarm now ships as part of the core Docker package, although users can optionally replace it with a different orchestration platform, such as Kubernetes.
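The first two improvements above can be sketched with a few commands; the volume, container, image, and network names are all hypothetical:

```shell
# Create a named volume and mount it into a container so that data
# persists across container restarts and removals.
docker volume create appdata
docker run -d --name db -v appdata:/var/lib/data myorg/db:1.0

# Create a user-defined network and attach the container to it, so
# containers on the same network can reach each other by name.
docker network create appnet
docker network connect appnet db
```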
These changes have all made it easier to work with Docker containers at scale inside production environments. In this way, they have readied Docker for the enterprise.
Alternatives to Docker
A discussion of Docker would not be complete without mentioning competing application-container solutions.
As of the time of writing, there are two main alternatives to Docker’s container platform: Rocket (Rkt), a container runtime developed primarily by CoreOS; and the Open Container Initiative Daemon (OCID), which is backed by Red Hat.
When I say that Rkt and OCID are Docker competitors, I mean that they are alternatives to Docker’s libcontainer runtime. These are technologies that make it possible to create and run application containers without using Docker’s software.
However, Rkt and OCID are merely container runtimes. They do not compete with Docker at other levels of the container stack, such as orchestration. No matter which runtime you use to create containers, you will still want an orchestrator like Swarm or Kubernetes to manage them at scale.
So far, Docker’s runtime clearly dominates the market. However, Rkt and OCID enjoy support from proponents of an open, standardized container format, particularly the one being developed by the Open Container Initiative (OCI).
While Docker is a member of the OCI and its container format is currently compatible with OCI draft specifications, supporters of Rkt and OCID contend that if Docker enjoys outsized market share over the long term, it will be in a position to deviate from community-accepted container formats. That could lead to compatibility and interoperability issues, potentially locking competitors out of the market.
For now, there is no evidence of any of this happening. The safe bet is on Docker continuing to control the application-container market for the foreseeable future, while remaining compatible with OCI specifications. Still, it is important to note that Docker is by no means the only option out there for building application containers.
If you want to understand what has made Docker so popular and how it has come to exert such a large impact on the technology sector in a few short years, read on to Chapter 2.
This chapter introduced the following key points:
- Docker is a system designed primarily for running individual applications inside containers. This makes it different from virtual machines, which host entire operating systems inside virtual environments.
- A complete container software stack includes not just a runtime but also a registry for hosting container images, an orchestrator for managing clusters of containerized applications, and a persistent storage solution for container data.
- Docker’s libcontainer runtime—the low-level process that allows a Docker host to start and manage containers—is only one of several container runtimes. Other runtime options with strong commercial support include Rkt and OCID.
1 At the 2013 PyCon demonstration of Docker, the first time anyone outside of dotCloud saw Docker in action, the company displayed an image of Lego men loading shipping containers onto a dock, suggesting how the Docker project got its name.