Chapter 1. Why Kubernetes in Production Is Hard
Kubernetes is the foundation for the modern cloud native application architecture. By running microservices in containers, the enterprise can move from a hardware-defined model, tied to physical or virtual machines, to a software-defined paradigm that enables horizontal scale across on-premises, cloud, and hybrid environments.
When moving to Kubernetes, day one is not as straightforward as simply deploying software. While the number of people who are well versed in Kubernetes operations is growing, there are still challenges. The following sections focus on three of the most significant challenges: automating storage, maintaining a cluster, and managing storage designed for a different computing era.
Automating Storage
Kubernetes was originally designed to handle stateless workloads, which don’t store data or other information about previous operations. While Kubernetes has evolved to handle stateful applications, which store persistent data, it does not automate storage the way it automates compute resources. Instead, Kubernetes relies on storage that sits outside the orchestration system.
Automating storage is not easy, and storage administrators often have plenty of work to do. Provisioning storage as workloads appear and disappear, or as applications scale up and down, can be time-consuming operations even when assisted by automation. Configuring systems to add, change, and remove resources in response to thresholds and events is a complex job.
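As a rough illustration of what responding to thresholds involves, the following minimal Python sketch decides whether a volume should grow. The `Volume` type, the `plan_resize` function, and the default values (an 80% threshold, a 1.5x growth factor, a 1,024 GiB ceiling) are assumptions made for this example; they are not part of Kubernetes or of any particular storage product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Volume:
    """Hypothetical view of a volume's capacity and usage, in GiB."""
    name: str
    capacity_gib: float
    used_gib: float


def plan_resize(
    volume: Volume,
    grow_threshold: float = 0.8,      # assumed: grow once 80% of capacity is used
    growth_factor: float = 1.5,       # assumed: grow by 50% each time
    max_capacity_gib: float = 1024.0  # assumed: hard ceiling per volume
) -> Optional[float]:
    """Return the new capacity in GiB if the volume should grow, else None."""
    utilization = volume.used_gib / volume.capacity_gib
    if utilization < grow_threshold:
        return None  # enough headroom, nothing to do
    new_capacity = min(volume.capacity_gib * growth_factor, max_capacity_gib)
    if new_capacity <= volume.capacity_gib:
        return None  # already at the ceiling, cannot grow further
    return new_capacity


if __name__ == "__main__":
    for vol in (Volume("db-data", 100.0, 85.0), Volume("logs", 100.0, 40.0)):
        print(vol.name, plan_resize(vol))
```

Even this toy policy leaves the difficult questions open: which backing pool has room for the larger volume, whether the resize disrupts the running application, and when capacity should be reclaimed.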
While Kubernetes load-balances services and requests within the cluster, it doesn’t really know how to load-balance storage. Different workloads, applications, and use cases have different storage usage patterns. Storage administrators, responding dynamically to changing storage needs, must constantly consider when and where to scale out storage, how much and where to grow volumes, and how to load-balance storage requests across the cluster. Without integrated tools, it can be very difficult to automate these tasks sufficiently to keep applications running smoothly.
In a modern DevOps software development environment, automation is key. For the enterprise to adopt Kubernetes, it must be possible to automatically scale, balance, and protect data as demand changes. For the storage administrator, these are often high-touch efforts. Managing storage capacity can be time-consuming and disruptive to running applications. Keeping efforts focused on application development means reducing this overhead through automation as much as possible. This requires storage platforms and tools that are aware of each application’s API and storage requirements and that can manage storage automatically, in ways analogous to how Kubernetes handles compute resources.
Maintaining a Kubernetes Cluster
Kubernetes abstracts compute infrastructure for distributed cloud native applications, decoupling them from the hardware where they run. Hardware health, failure, and other concerns no longer bear directly on the performance of the software. When hardware fails, Kubernetes automatically reschedules the affected containers onto healthy nodes and the application keeps running.
However, the underlying hardware that supports the cluster requires continual monitoring to ensure availability to applications and their users. While Kubernetes manages and moves containers in response to changing conditions and demands, any enterprise with an on-premises deployment in its portfolio must maintain physical servers to keep the cluster running. The IT team must ensure that it can support the service level agreements (SLAs) that the business requires. This involves replacing hard drives and other components as they fail, which is a very common occurrence at scale.
This maintenance often requires migrating large pools of replicated data from one storage infrastructure to another, which is complicated by the different needs of the various applications that use the storage. It’s nearly impossible to complete a large manual migration without downtime and without losing some application’s operating state. An intelligent storage platform can help, easing the task of protecting and migrating data as hardware is provisioned and managed, or as applications move to new environments.
Storage for Another Era
Traditional applications, which run directly on physical or virtual machines, have simpler storage needs than distributed, containerized applications. Monolithic application state is often stored in a shared, mutable table that is straightforward to back up. In fact, some applications running on VMs rely only on local storage. In such cases, backing up the VM is sufficient to back up not only the application’s data but its configuration and running state as well.
Containerized applications are different. Because containers are immutable and ephemeral, any workloads that require persistent data must connect to an external system. Backing up a container is not sufficient to capture persistent data, since the data doesn’t reside in the container itself. Instead, Kubernetes provides a mechanism for associating a unit of persistent storage with a pod: a group of one or more containers that run on a single host node and share resources.
Kubernetes can maintain a set number of replicas of a pod to ensure application availability. Each pod can define one or more local or remote volumes, or cohesive units of persistent storage space. Depending on the volume type, a volume can persist data beyond the lifetime of the containers or of the pod itself. Multiple containers in a pod can mount and access the same volume, so applications can share data between containers that have different tasks. For example, an init container can run before the service starts, creating a custom configuration file for the environment where the application is running.
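The init container pattern can be sketched as a pod manifest. The following Python builds the manifest as a dictionary and prints it as JSON, a format that kubectl accepts alongside YAML. The pod name, image names, mount path, and file contents are placeholders chosen for this example. The shared volume is an `emptyDir`, which survives container restarts but is deleted with the pod; data that must outlive the pod needs a persistent volume instead.

```python
import json


def build_pod_manifest(name: str, app_image: str, init_image: str) -> dict:
    """Build a Pod manifest in which an init container writes a config
    file to a volume that the application container also mounts."""
    # Both containers mount the same volume at the same (assumed) path.
    mount = {"name": "config", "mountPath": "/etc/app"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # emptyDir lives only as long as the pod itself.
            "volumes": [{"name": "config", "emptyDir": {}}],
            "initContainers": [
                {
                    "name": "write-config",
                    "image": init_image,
                    # Placeholder command: writes a one-line config file.
                    "command": [
                        "sh",
                        "-c",
                        "echo 'mode=production' > /etc/app/app.conf",
                    ],
                    "volumeMounts": [dict(mount)],
                }
            ],
            "containers": [
                {
                    "name": "app",
                    "image": app_image,
                    "volumeMounts": [dict(mount)],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Image names below are illustrative placeholders.
    manifest = build_pod_manifest("demo", "example/app:1.0", "busybox:1.36")
    print(json.dumps(manifest, indent=2))
```

Because the init container finishes before the application container starts, the application can rely on the configuration file being present on the shared volume when it begins running.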
Initially, extending the capabilities of Kubernetes storage meant changing the Kubernetes codebase itself. Kubernetes has deprecated this “in-tree” approach, replacing it with a standard interface called the Container Storage Interface (CSI), which enables the development of storage plug-ins without changing any Kubernetes code. The CSI provides a way to support multiple storage interfaces that allow containerized workloads to store persistent data on different types of externally managed storage pools (Figure 1-1).
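From the application's side, a CSI plug-in is typically reached through a StorageClass that names the driver as its provisioner and a PersistentVolumeClaim that requests storage from that class. The following Python sketch builds both manifests as dictionaries. The driver name `csi.example.com`, the class and claim names, and the 10Gi size are hypothetical; a real driver documents its own provisioner name and parameters.

```python
import json


def build_storage_class(name: str, provisioner: str) -> dict:
    """StorageClass that delegates provisioning to a CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,  # name registered by the CSI driver
        "reclaimPolicy": "Delete",
        "allowVolumeExpansion": True,
    }


def build_pvc(name: str, storage_class: str, size: str) -> dict:
    """PersistentVolumeClaim that requests storage from the class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "csi.example.com" is a made-up driver name for illustration only.
    storage_class = build_storage_class("fast", "csi.example.com")
    claim = build_pvc("db-data", storage_class["metadata"]["name"], "10Gi")
    for manifest in (storage_class, claim):
        print(json.dumps(manifest, indent=2))
```

The claim refers only to the class name, so the workload does not need to know which storage system or driver ultimately supplies the volume.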
Although this model solves some problems, it fails to bring storage into the virtualized, software-defined infrastructure where containerized applications run. While the compute resources used by modern enterprise applications run on application-aware infrastructure, provisioned and managed declaratively and driven by the end user, storage remains a physical concept tied to virtual or physical servers. Because cloud native applications are distributed, backing up any given VM is likely to capture partial data from multiple applications, while failing to store complete data from any single application.
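A small simulation makes the partial-backup problem concrete. In this Python sketch, the placement of pods on VMs and the application names are invented for illustration; the function counts how many of each application's pods a backup of a single VM would capture.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical placement of pods on VMs; names are illustrative only.
PLACEMENT: Dict[str, List[str]] = {
    "vm-a": ["orders-0", "catalog-0"],
    "vm-b": ["orders-1", "catalog-1"],
    "vm-c": ["orders-2"],
}


def app_of(pod: str) -> str:
    """Derive the application name from a pod name such as 'orders-0'."""
    return pod.rsplit("-", 1)[0]


def backup_coverage(
    placement: Dict[str, List[str]], backed_up_vm: str
) -> Dict[str, Tuple[int, int]]:
    """Map each application to (pods captured, total pods) when only
    the given VM is backed up."""
    total: Dict[str, int] = defaultdict(int)
    captured: Dict[str, int] = defaultdict(int)
    for vm, pods in placement.items():
        for pod in pods:
            app = app_of(pod)
            total[app] += 1
            if vm == backed_up_vm:
                captured[app] += 1
    return {app: (captured[app], total[app]) for app in total}


if __name__ == "__main__":
    print(backup_coverage(PLACEMENT, "vm-a"))
```

With this placement, backing up `vm-a` captures one of three `orders` pods and one of two `catalog` pods: data from both applications, and a complete copy of neither.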
For this reason, on-premises and cloud-based storage systems designed for physical or virtual machines are poorly suited to the scale and complexity of containerized applications. While containers are software defined, disposable, replaceable, and decoupled from hardware, traditional storage is concerned with managing pools of physical media. Containerized applications are highly dynamic, scaling up and down rapidly as demand changes by creating, destroying, and moving containers automatically. Traditional storage methods don’t respond quickly enough to support these modern architectures.
For file, block, and object storage, cloud native storage solutions emerged to orchestrate a software-defined pool of persistent storage for access across a Kubernetes cluster, using a containerized architecture to provide dynamic storage at scale. These solutions work well, but they often target specific types of storage, a few filesystems, and a selection of database services.
The enterprise needs software-defined, general-purpose storage that’s capable of scaling up to meet the demands of big data: streaming, batch processing, transactional databases, high availability and disaster recovery, data locality, and data mobility. To support fast recovery time objectives, backups must incorporate all application state, including data and configuration.
Specific application types have additional, more stringent needs. For example, database applications need to guarantee that data transactions are ACID (atomic, consistent, isolated, and durable) and usually require secondary indexes. Because pods are ephemeral and movable, it is difficult to meet these guarantees in a containerized environment.
To make Kubernetes work for the enterprise, storage solutions must work at the container level rather than the VM level, must be aware of Kubernetes namespaces, and must be able to back up entire applications across VMs, including their configuration and state.