Chapter 4. Docker Migration Guide
If you want to integrate Docker into your enterprise but you’re not sure how to get started, this chapter is for you. It discusses the decisions you’ll have to make as you plan a migration to Docker containers, resources that can help smooth the process, and potential pitfalls to avoid.
Planning Your Migration
The first step in migrating to Docker containers is to identify how you want to use them. This is important because, as earlier chapters have noted, Docker containers will exist alongside virtual machines or other older forms of infrastructure technology. A successful Docker strategy requires evaluating which types of workloads you want to migrate to containers and where your containers will fit within the rest of your infrastructure.
Questions to ask yourself as you plan a Docker migration include:
- Will you use Docker for application deployment in production environments or just for development? Docker can help your developers streamline their work by making application testing and updates easier. It can also make end-user applications more flexible and scalable, but it does not have to be used in production.
- Will you migrate legacy applications to Docker, or only deploy new applications inside containers? Since most existing applications were not designed to run as microservices, porting them so that they can be hosted by containers requires some investment—although, as the ADP case in the previous chapter shows, this can be done. In contrast, new applications can be designed from the start to run on Docker.
- Can all of your applications run on Linux? As of late 2016, native Docker support for Windows remains primitive. If the applications you want to run with Docker are Linux-compatible, this will not pose a problem. But if you have Windows applications, you may want to wait for Docker’s Windows support to mature before migrating those applications to Docker.
- Where will you host containers? Docker can run on bare-metal servers or virtual servers within your own on-premise data center. This gives you more control over the deployment and can help meet compliance and regulatory goals. Docker can also run on virtual servers within a public cloud, such as Azure or AWS. The public cloud is more scalable, but it might also pose compliance challenges, since it requires moving customer data off-premise. Weigh these different hosting options before deciding which infrastructure will run your Docker environment.
Building Your Docker Stack
Another essential task is deciding what to include in your Docker container stack. In addition to Docker itself, which provides the container runtime, a container deployment should include the following components:
- A registry for hosting container images
A variety of open source container registries are available, such as Docker Registry and CoreOS Quay. Some of these registries can be deployed on-premise; others are hosted services. Some are free, and some cost money.
- A container orchestrator
A variety of choices are available here as well. The most popular options include Swarm (an orchestrator that is integrated into Docker itself but does not have to be used), Kubernetes, and Mesos.
- Security tools
To keep your containers secure, you’ll want an image scanner, like Docker Security Scanning or Clair. Enterprise-oriented security and compliance platforms for containers, such as Twistlock, may also be wise investments, especially if security and data privacy are essential considerations for your business.
- Monitoring tools
You can collect some basic information about the status of a container from Docker itself. For more robust monitoring of your containerized applications, including analytics tools and visual dashboards, you’ll want a container-ready monitoring solution like Netuitive, Datadog, or New Relic.
- A host operating system
Docker can run on any modern Linux distribution, as well as Windows Server 2016 (although, as noted previously, Docker support on Windows is not yet fully mature). Some Linux distributions, such as RancherOS, are designed for the express purpose of hosting Docker and may be a convenient choice. On the other hand, if your IT team is already familiar with a different Linux platform, they may be more comfortable using that to host Docker. In most cases, installing Docker on any modern Linux distribution is as simple as running a few commands.
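The basic status information that Docker itself provides, mentioned under monitoring tools above, can be read with the docker inspect and docker stats commands. The following is a minimal sketch, assuming the Docker CLI is installed on the host. The container name web and the helper function names are hypothetical, chosen only for illustration. The helpers only build the command lines, so nothing is executed until you decide to run them.

```python
import shlex


def inspect_status_command(container):
    """Build a command that prints a container's state, such as running or exited."""
    # --format takes a Go template; .State.Status is a field in docker inspect output.
    return ["docker", "inspect", "--format", "{{.State.Status}}", container]


def stats_snapshot_command(container):
    """Build a command that prints one snapshot of the container's resource usage."""
    # --no-stream prints a single reading instead of refreshing continuously.
    return ["docker", "stats", "--no-stream", container]


if __name__ == "__main__":
    # "web" is a hypothetical container name used only for this illustration.
    for command in (inspect_status_command("web"), stats_snapshot_command("web")):
        print(shlex.join(command))
```

Each helper returns an argument list that could be passed to subprocess.run on a host where Docker is running. Output of this kind covers only a single container at a single moment, which is why a dedicated monitoring tool is still worth considering.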
If you’re searching for a quick on-ramp to containerized infrastructure, you might consider adopting a containers-as-a-service (CaaS) platform. A CaaS eliminates the need to build a Docker software stack from the ground up.
CaaS offerings are designed to be turnkey solutions for running containers. Their major benefit is that they require much less time to set up and maintain than a Docker software stack built from scratch. Just like software-as-a-service (SaaS) and platform-as-a-service (PaaS) platforms, CaaS systems give you everything you need to run containers out of the box.
The largest drawback of CaaS platforms is that, like other SaaS and PaaS offerings, they restrict user choice to some degree. If you use a CaaS, you’ll likely be limited to using whichever orchestrators, container registries, and other components it supports. You may not have as much freedom to choose between different options as you would if you built the stack yourself.
A number of enterprise-ready CaaS options have appeared over the past two years. Major offerings include:
- Elastic Container Service, a CaaS that runs in the AWS cloud
- Azure Container Service, the Azure cloud CaaS
- Rancher, a CaaS that can be hosted in the public cloud or on an on-premise server
- OpenShift, a CaaS platform based on Red Hat Enterprise Linux
- Docker Datacenter, Docker’s own CaaS
- Oracle Container Service, a CaaS that runs in the Oracle cloud
- IBM Containers, a CaaS offered as part of IBM’s Bluemix cloud service
The differences between these offerings boil down to two main factors:
- The type of infrastructure they run on
Some CaaS platforms only support the public cloud. Others only run on-premise. Some can run in both settings.
- The amount of flexibility they offer
Some CaaS platforms, like Rancher, allow relatively wide freedom in choosing which types of container orchestrators and registries to use and which operating systems they can run on. Others are more limited; for example, OpenShift supports only Kubernetes and runs on top of Red Hat Enterprise Linux.
CaaS platforms are not the only way to migrate to Docker. If you can find a CaaS that fits your enterprise’s particular needs, however, it may be a faster and less complicated path to Docker implementation than one that requires you to configure and manage all components of the container stack yourself.
Mistakes to Avoid: Docker Antipatterns
Whichever route you take to implementing containers, you’ll want to steer clear of common pitfalls that can undermine the efficiency of your Docker stack.
- Don’t run too many processes inside a single container
The beauty of containers—and an advantage of containers over virtual machines—is that it is easy to make multiple containers interact with one another in order to compose a complete application. There is no need to run a full application inside a single container. Instead, break your application down as much as possible into discrete services, and distribute services across multiple containers. This maximizes flexibility and reliability.
- Don’t install operating systems inside Docker containers
It is possible to install a complete Linux operating system inside a container. In most cases, however, this is not necessary. If your goal is to host just a single application or part of an application in the container, you need to install only the essential pieces inside the container. Installing an operating system on top of those pieces adds unnecessary overhead.
- Don’t run unnecessary services inside a container
To make the most of containers, you want each container to be as lean as possible. This maximizes performance and minimizes security risks. For this reason, avoid running services that are not strictly necessary. For example, unless it is absolutely essential to have an SSH service running inside the container (which is probably not the case, because there are other ways to log in to a container, such as the docker exec command), don’t include an SSH service.
- Remember to include security and monitoring tools in your container stack
You can run containers without security and monitoring solutions in place. But if you want to use containers securely and efficiently, you’ll need to include these additional components in your stack.
- Don’t store sensitive data inside a container image
This is a mistake because anyone with access to the registry in which the container image is stored, or to the running container, can read the information. Instead, store sensitive data on a secure filesystem to which the container can connect. In most scenarios, this filesystem would exist on the container host or be available over the network.
- Don’t store data or logs inside containers
As noted earlier, any data stored inside a container is lost when that container is removed, unless it has been written to a persistent storage location that is external to the container. That means you need to have such a persistent storage solution in place for any logs or other container data that you want to keep permanently.
- Don’t use container registries for purposes other than storing container images
A container registry is designed to do one thing and one thing only: store container images. Although container registries offer features that might make them attractive as all-purpose repositories for storing other types of data, doing so is a mistake. Vine learned this the hard way in 2016, when the company was using a container registry to store sensitive source code, which was inadvertently exposed to the public.
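Several of the recommendations above translate directly into how a container is started. The sketch below, which assumes the Docker CLI is installed, builds three command lines: one that opens a shell with docker exec instead of SSH, one that mounts a host directory of sensitive files read-only instead of baking them into the image, and one that keeps logs on a named volume that is external to the container. All container, image, volume, and path names, as well as the helper functions themselves, are hypothetical.

```python
import shlex


def exec_shell_command(container, shell="sh"):
    """Open an interactive shell in a running container; no SSH service is needed."""
    # -i keeps stdin open and -t allocates a terminal.
    return ["docker", "exec", "-i", "-t", container, shell]


def run_with_secrets_command(image, name, host_dir, mount_point="/etc/app/secrets"):
    """Start a container that reads sensitive files from the host, not from the image."""
    # The :ro suffix makes the host directory read-only inside the container.
    return [
        "docker", "run", "-d", "--name", name,
        "-v", f"{host_dir}:{mount_point}:ro",
        image,
    ]


def run_with_log_volume_command(image, name, volume, log_dir="/var/log/app"):
    """Start a container whose log directory lives on a named volume."""
    # Data on a named volume remains available after the container is removed.
    return [
        "docker", "run", "-d", "--name", name,
        "-v", f"{volume}:{log_dir}",
        image,
    ]


if __name__ == "__main__":
    # Every name below is a placeholder for illustration only.
    commands = [
        exec_shell_command("web"),
        run_with_secrets_command("example/app:1.0", "app", "/srv/secrets"),
        run_with_log_volume_command("example/app:1.0", "app", "app-logs"),
    ]
    for command in commands:
        print(shlex.join(command))
```

Each helper returns an argument list that could be passed to subprocess.run once the placeholder names have been replaced with real ones. Keeping the secrets mount read-only is a deliberate choice here: the container can read the files but cannot alter them on the host.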
A successful migration to Docker in the enterprise involves several steps:
- Identify the types of workloads that will be migrated to containers. Will you move all existing applications to containers, or only use containers for new applications? Will you continue to use virtual machines for some workloads?
- Prepare infrastructure for hosting your containers. If you don’t already have the server capacity you’ll need to run Docker, you’ll either have to buy more servers or run Docker in the cloud.
- Create a detailed plan for how you will build your container stack and which platforms you will use to provide the various components.
- Understand common mistakes made when using Docker.
The next chapter will broaden our discussion of container-like technology by taking a look at other examples of containers and microservices beyond Docker.