Chapter 1. Why Kubernetes Adoption Is Complex
Modern application design has moved from the creation of huge monoliths to a more flexible architecture based on microservices running in containers. Containers are small runtime environments that include the dependencies and configuration files the services need to run. Containers are the building blocks of the cloud native approach, enabling scalable applications in diverse environments, including public, private, and hybrid clouds, as well as bare metal and edge locations.
Beyond the significant advantage of empowering application development teams to work in parallel on different services without having to update the entirety of an application, the cloud native model offers a number of advantages over monolithic architecture from an infrastructure perspective. Containerized applications use resources more efficiently than virtual machines (VMs), can run in a broader variety of environments, and can be scaled more easily. These advantages have driven wide adoption of microservice-based architecture, containers, and the predominant container orchestration platform: Kubernetes.
Kubernetes facilitates the management of these distributed applications, allowing you to scale dynamically both horizontally and vertically as needed. Containers bring consistency of management to different applications, simplifying operational and lifecycle tasks. By orchestrating containers, Kubernetes can operationalize the management of applications across an entire environment, controlling and balancing resource consumption, providing automatic failover, and simplifying deployment.
Although Kubernetes provides a foundation for resilient and flexible cloud native application development, it introduces its own complexities to the organization. Running and managing Kubernetes at scale is no easy task, and the difficulties are compounded by the inconsistencies between different providers and environments.
Kubernetes manages a cluster of physical or virtual servers, called worker nodes, each of which hosts containers organized into pods. A separate, smaller set of servers is reserved as control plane nodes, which together form the cluster's control plane. To support multitenancy, a Kubernetes cluster offers logical separation between workloads using namespaces—a mechanism for separating resources based on ownership—to provide a virtual cluster for each team.
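A namespace is itself just another Kubernetes object. As a minimal sketch, a manifest creating a namespace for one team might look like the following (the name and label are illustrative):

```yaml
# Hypothetical namespace giving one team its own slice of the cluster;
# the name and label values are illustrative.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    team: team-a
```

Applied with `kubectl apply -f`, this gives the team a scope in which its workloads, quotas, and access policies can be isolated from other tenants.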
The control plane is the main access point that lets administrators and others manage the cluster. The control plane also stores state and configuration data for the cluster, tells worker nodes when to create and destroy containers, and routes traffic in the cluster.
The control plane consists mainly of the following components:
- API Server
The access point through which the control plane, worker agents (kubelets), and users communicate with the cluster
- Controller manager
A service that manages the cluster through the API server by running controllers, which bring the actual state of the cluster in line with its specifications
- etcd
A distributed key-value store that contains cluster state and configuration
- Scheduler
A service that manages node resources, assigning work based on availability
- kubelet
The agent running on every worker node that runs pods via a container runtime
Figure 1-1 shows the basic components of a Kubernetes cluster.
For high availability, the control plane is often replicated by maintaining multiple copies of the essential services and data required to run the cluster (mainly the API server and etcd).
You manage every aspect of a Kubernetes cluster’s configuration (Deployments, pods, StatefulSets, PersistentVolumeClaims, and so on) declaratively: you declare the desired state of each component, leaving Kubernetes to ensure that reality matches your specification. Kubernetes maintains a controller for each object type to bring the state of every object in the cluster in line with the declared state. For example, if you declare a certain number of pod replicas, Kubernetes ensures that when a node fails, its pods are rescheduled onto a healthy node.
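As a concrete sketch, a Deployment manifest declares a desired replica count, and Kubernetes works continuously to keep that many pods running (the names and image here are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # illustrative name
spec:
  replicas: 3               # desired state: three pod replicas at all times
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25   # illustrative container image
```

If a node hosting one of these pods fails, the controller notices that only two replicas are running and schedules a replacement elsewhere—no operator intervention required.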
Kubernetes Objects and Custom Resource Definitions
Kubernetes represents the cluster as objects. You create an object declaratively by writing a manifest file—a YAML document that describes the intended state of the object—and running a command to create the object from the file.
A controller makes sure the object exists and matches the state declared in the manifest. A controller is essentially a control loop, similar to a voltage regulator or thermostat, that knows how to maintain the state of an object within specified parameters.
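The control-loop idea can be sketched in a few lines of Python. This is an illustration of the pattern only, not actual Kubernetes code; the function names are invented:

```python
def reconcile(desired: int, observed: int) -> int:
    """One pass of a control loop: return how many replicas to add
    (positive) or remove (negative) to converge on the desired state."""
    return desired - observed

def run_to_convergence(desired: int, observed: int) -> int:
    # Keep reconciling until the observed state matches the declared state,
    # the way a thermostat repeatedly nudges temperature toward a setpoint.
    while (delta := reconcile(desired, observed)) != 0:
        observed += delta  # in Kubernetes, this step would create or delete pods
    return observed
```

Real controllers run this loop continuously against the API server, so any drift—a crashed pod, a failed node, an edited manifest—is detected and corrected on the next pass.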
A Kubernetes resource is an endpoint in the Kubernetes API that stores a certain type of object. You can create a custom resource using a custom resource definition (CRD) to represent a new kind of object. In fact, some core Kubernetes resources now use CRDs because they make it easier to extend and update the capabilities of the objects.
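As a sketch, a CRD that registers a hypothetical `Backup` object might look like the following; the API group, kind, and field names are invented for illustration:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.example.com   # must be <plural>.<group>
spec:
  group: example.com          # illustrative API group
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              schedule:
                type: string  # e.g., a cron expression
```

Once the CRD is applied, the API server serves `Backup` objects like any built-in resource, and a custom controller can reconcile them.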
The Kubernetes Adoption Journey
Many organizations follow a well-traveled path in their adoption of Kubernetes, starting with experimentation before they decide whether to rely on it. The journey nearly always leads from a single cluster to the complexity of managing many clusters in different environments. Figure 1-2 shows a typical journey to Kubernetes adoption, beginning with experimentation, moving to productization, and finally to developing a managed platform.
Experimenting with Kubernetes
In the experimentation phase, developers drive investigation into the capabilities of Kubernetes by containerizing a few projects. The open nature of Kubernetes makes it easy to manage on a small scale, using the command-line interface and writing scripts to make changes to the cluster and to integrate other open source components. At this stage, the organization often has not yet engaged with security, upgrades, availability, and other concerns that become important later. As their needs change, the team makes configuration changes and integrates components gradually, often without documenting the evolution of the cluster. When the time comes to support a broader array of environments or expand access to more teams, it becomes apparent that the cluster is tailor-made for the small set of use cases it has been serving thus far.
Productizing Kubernetes for the Organization
As the organization begins to recognize the value of Kubernetes, teams begin to investigate how to scale general-purpose clusters that can serve the needs of the entire company. Departments such as service reliability and IT operations begin looking for ways to make Kubernetes secure, supportable, and manageable. These teams begin advocating for prescriptive, off-the-shelf solutions that trade flexibility for reliability. As the organization prepares to scale Kubernetes to fit its needs, it might find that these solutions are rigid and tend to silo each cluster, making cross-team work more difficult.
Developing Kubernetes as a Managed Platform
To make Kubernetes work for the business needs of the organization, it must be possible to easily deploy, maintain, and scale clusters that are highly available and can handle applications and workloads that dynamically meet changing demands. The goal of the organization must therefore be to make Kubernetes operate as a modern platform at scale, providing resources to multiple teams.
At this point, the investigation often focuses on commercial Kubernetes platforms that can solve the organization’s problems out of the box. The team likely has enough experience to look for features such as flexibility, repeatable deployment, and the ability to manage multiple clusters across different environments, including diverse cloud platforms, virtualized or bare metal data centers, and increasingly, edge locations. IT operations and platform engineering teams will especially be looking at securing cluster access, configuration management, and Day 2 concerns like scaling, upgrades, quota control, logging, monitoring, continuity, and others.
As the platform takes shape and evolves, the tendency is to break larger multitenant clusters into smaller special-purpose clusters. This approach allows more flexible management, more efficient use of resources, and a more tailored experience for the teams using the clusters within the organization. From a security standpoint, smaller clusters help make defense in depth policies easier to implement, reducing the “blast radius” in the case of a breach. Operating multiple clusters also makes it possible to deploy apps wherever needed, whether in the cloud or on premises, or even at the edge.
The Challenges of Kubernetes
Every environment deals with infrastructure differently, meaning different challenges for every type of Kubernetes deployment. Provisioning, upgrading, and maintaining the infrastructure to support the control plane can be difficult and time-consuming. Once the cluster is up, integrating basic components like storage, networking, and security presents significant hurdles. Finally, in a deployment of multiple clusters across a variety of environments, each cluster must be managed individually. There is no native tool in Kubernetes itself for managing the clusters as a group, and differences between environments introduce operational nuances to each cluster.
Below the layer of Kubernetes are the resources that the cluster needs, mainly storage, networking, and the physical or virtual machines where the cluster runs. This means that running a cluster involves two separate lifecycles that must be managed concurrently: Kubernetes and the hardware and operating system supporting it. Because Kubernetes is quickly evolving, these two layers must be kept in sync. Many problems with node or cluster availability come from incompatibilities between the versions of the operating system and the Kubernetes components. These problems become exponentially more difficult in multicluster deployments because the problem of keeping Kubernetes and the operating system in sync is multiplied by the number of platforms, each with its own complexities, then multiplied again by any specific versions or flavors required by the teams using the clusters.
Initially, Kubernetes didn’t include any tools for managing the cluster infrastructure. Bringing up machines and installing Kubernetes were manual procedures. Creating clusters outside of managed Kubernetes environments required a lot of effort and custom tooling. The Kubernetes community needed a way to provide common tools to bootstrap a cluster.
To meet this challenge, the Kubernetes community began developing tools to simplify provisioning and maintaining clusters. Here are some examples that were developed over the last few years:
- kube-up.sh
The first tool to bring up a Kubernetes cluster (2015)
- kubeadm
A tool for initializing, configuring, and upgrading Kubernetes clusters (2016)
- kops
A tool for Kubernetes lifecycle management, including infrastructure and dependencies (2016)
- Cluster API
A cluster management framework with modular support for cloud providers (2018)
These tools were a great step in the direction of integrated cluster management but fell short of a complete solution. More complex workloads began to require more sophisticated deployments, including multiple clusters. Managing these deployments, in turn, became more difficult. In particular, it was clear there was a need to be able to manage clusters consistently and declaratively using a cohesive API—the way that Kubernetes manages nodes, pods, and containers.
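The Cluster API project is one answer to that need: it extends the declarative model to clusters themselves. A simplified Cluster manifest might look like the following sketch; the names are illustrative, and the control plane and infrastructure references vary by provider:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: demo-cluster            # illustrative name
  namespace: default
spec:
  controlPlaneRef:              # which control plane implementation to use
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: demo-control-plane
  infrastructureRef:            # provider-specific infrastructure (illustrative)
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DockerCluster
    name: demo-cluster
```

With this model, an entire cluster is just another object with a desired state, reconciled by controllers running in a management cluster—the same pattern Kubernetes uses for pods and nodes.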