Chapter 1. Kubernetes Is the Operating System of the Cloud
Before diving deeper into the specifics of Portainer, I want to establish some context about the ecosystem Portainer comes from and its broader reason for existence. As mentioned, I first used Portainer to manage a few Docker Swarm clusters. Over the years, the industry has moved away from Docker Swarm and toward Kubernetes as its container orchestrator of choice. My clients have followed this trend. But my initial need to provide secure access to containerized applications for teams of developers persisted—and it still does today. Luckily, Portainer quickly added support for Kubernetes as a container orchestrator.
While the underlying mechanics of Docker Swarm and Kubernetes are fairly different, Portainer managed to provide the same user experience for both systems. That made it easy for the teams I already worked with to transition smoothly from one platform to the other. To understand why one would choose one orchestrator over the other, and why Portainer started to focus mainly on Kubernetes, we need to know a few things about Kubernetes itself and its own raison d'être.
In 2014, Google developed and released Kubernetes as an improved approach to container orchestration, building on the knowledge Google had acquired over years of running distributed systems at scale.
Note
Kubernetes was donated to the Cloud Native Computing Foundation (CNCF) in 2015, alongside its first stable release, and has been developed as a community project under the umbrella of the CNCF ever since. Its underlying architecture and design principles stem from Borg and Omega, Google's internal cluster management systems. Many of the developers who worked on Borg and Omega at Google now work on Kubernetes.
Given its lineage and its open source nature, Kubernetes (or K8s for short) soon became the dominant container orchestration engine in the cloud native ecosystem. It was built to handle huge distributed workloads, and the community efforts around it ensured the software kept up with the requirements of the big cloud providers and of individual enterprises looking to run containers in production. One of the first companies to incorporate Kubernetes into its commercial tech stack was Red Hat, which used it as the foundation for its OpenShift Container Platform.
To understand what makes Kubernetes the operating system of the cloud, we need to look at the history of containers and how they changed the way we build and deploy software today.
A Few Words on Containers
Simply put, containers are packages of software that contain all of the necessary elements to run in any environment.
With the advent of Docker in 2013, the way we perceive and use containers changed. Before Docker, software containers worked a lot like virtual machines (VMs)—carrying a whole operating system, running multiple long-running applications (i.e., services), and giving you access through Secure Shell (SSH). The main difference between containers and VMs was that VMs emulated the full hardware stack, fully isolating the system from the underlying operating system, while containers worked in a change-root (chroot) environment and depended on the kernel and resources of the host operating system.
Docker established a new paradigm by stripping away pretty much everything but the actual application you want to run and its most important dependencies. Where previously you had to run a full web application stack, including the web server, database, and anything else you needed, in a single container, with Docker you could split each of those services into its own container and have them interface with each other over the network.
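To make this concrete, here's a minimal sketch of that split using plain Docker commands. The network name, container names, and image tags are purely illustrative:

    # Create a user-defined network so containers can resolve each
    # other by name (all names here are illustrative).
    docker network create webstack

    # Run the database and the web server as separate containers
    # instead of bundling them into a single one.
    docker run -d --name db --network webstack \
        -e POSTGRES_PASSWORD=example postgres:16
    docker run -d --name web --network webstack -p 8080:80 nginx:1.27

    # From inside "web", the database is now reachable as db:5432.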
If done correctly, a Docker container runs only a single process, and if that process terminates, so does the container. Unlike virtual machines or "legacy" containers, this way of doing things comes closer to how process management in an operating system works. It requires a different kind of planning and resource management when you need multiple processes to work together, but it also allows you to properly isolate workloads and reduce an application's attack surface to a minimum.
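You can watch this lifecycle in action with a quick experiment (the container name here is arbitrary):

    # The container lives exactly as long as its single main process.
    docker run --name one-shot alpine echo "hello"

    # "echo" exits immediately, so the container has already stopped:
    docker ps -a --filter name=one-shot
    # The STATUS column will show something like "Exited (0)".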
With containers running only a single process and its most important dependencies, Docker also makes those applications portable between different operating systems, since a container no longer depends on the specifics of its host operating system. Everything is isolated and only interfaces with the outside world over the network or through directories explicitly mounted from the host machine into the container. As long as the server you want to run the container on supports the Docker Engine, you can rest assured that your container will run perfectly fine on it.
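In practice, those two openings are a published port and an explicitly mounted host directory; the host path in this sketch is a placeholder:

    # Publish port 8080 on the host and mount a host directory
    # read-only into the container; apart from these two deliberate
    # openings, the container is isolated from the host.
    docker run -d --name site -p 8080:80 \
        -v /srv/site:/usr/share/nginx/html:ro nginx:1.27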
So, What Is Kubernetes?
That level of portability created new possibilities in terms of scalability. It's very common to run an application on multiple servers and place a load balancer in front of it to achieve higher stability or better performance. We call this horizontal scaling. Before, the main mechanism for scaling an application was to vertically scale the server it was running on—i.e., giving more resources to a single machine. That way of scaling has its limitations in terms of hardware and availability, since upgrading machines typically results in downtime.
With Docker, it became very easy to run the same container on different machines and achieve horizontal scaling. Docker Swarm was the first approach to build fast up- and downscaling across multiple clustered machines, plus an integrated load-balancing mechanism, into the Docker Engine itself. You could easily promote a single container to a service and increase its replicas, letting Docker Swarm manage the placement of the replicas on the nodes in the cluster and handle the routing of traffic through its integrated overlay network.
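A rough sketch of that workflow, assuming a machine where Swarm mode can be initialized (service name and image are illustrative):

    # Turn the local Docker Engine into a (single-node) swarm; a real
    # cluster would join further worker nodes.
    docker swarm init

    # Promote a container image to a replicated service. Swarm places
    # the replicas across the cluster and routes port 8080 on every
    # node to them via the overlay network.
    docker service create --name web --replicas 3 -p 8080:80 nginx:1.27

    # Scaling up or down is a single command:
    docker service scale web=5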
Kubernetes took the concepts of containerized software and workload orchestration to a whole new level, providing well-designed abstractions for handling compute, network, and storage over distributed servers. While Docker Swarm exposes a human-friendly interface to orchestrate containers, it still lacks many advanced features. There’s no production-grade support for distributed storage mechanisms apart from NFS (Network File System) or SMB (Server Message Block), and it has no concept of network policies or support for custom ingress implementations for advanced handling of incoming traffic. Kubernetes—stemming from Google’s experience in running distributed systems at scale—delivers all those things and more by providing generalized building blocks instead of specific implementations.
In Kubernetes, everything is handled by APIs whose implementations are decoupled from the core logic of the API server. As a result, the developer community has been able to build implementations of specific components like networking, storage, or ingress independently of the core logic that keeps everything running. This works a lot like a modern operating system (OS): the integral parts of the system are compartmentalized behind clear interfaces for their domains, without prescribing the actual implementation, giving end users the opportunity to bring their own tooling for each domain-specific need.
A good example of this is the networking stack in Linux. While most distributions use Linux's own implementation of the TCP stack, it's absolutely fine and sometimes desirable to replace it with something homegrown.
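Back in Kubernetes, a StorageClass is a nice illustration of the same idea: the API object only names which provisioner to use, while the actual storage driver is a separate, pluggable component. In this sketch, the provisioner name is a hypothetical placeholder for whatever CSI driver a given cluster actually ships with:

    # The API object declares *what* is needed; a pluggable CSI
    # driver (named by "provisioner") supplies the *how*.
    kubectl apply -f - <<EOF
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast
    provisioner: csi.example.com   # hypothetical CSI driver
    EOF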
Note
When most people think of an OS, they think of Microsoft Windows or Linux, which are specific implementations of what an OS actually is: low-level system software that manages computer hardware and software resources and provides common services for computer programs.
Looking at Kubernetes from a high level (see Figure 1-1), it does exactly this: handling memory allocation for processes, providing ways for processes to communicate, and providing interfaces to the underlying hardware. The difference between a classical OS and Kubernetes is that Kubernetes does this for a cluster of machines instead of a single one. Seen this way, a container becomes a single process, and Kubernetes works like the kernel of an OS, making sure that the process gets what it needs to work properly.
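To make the analogy tangible, consider how a Pod declares its resource needs and the Kubernetes scheduler, much like a kernel, finds a node that can satisfy them. The names and values in this sketch are illustrative:

    # Declare CPU and memory needs; the scheduler picks a node that
    # can satisfy the requests, and the limits cap the "process".
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: demo
    spec:
      containers:
      - name: app
        image: nginx:1.27
        resources:
          requests:
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
    EOF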
The Road to Microservices
Modern operating systems profit a lot from being able to share resources efficiently between processes. Pretty much every OS comes with some kind of package manager or service registry, allowing software developers to reuse components and access a central index of "what's there." One good example of this is how APT (the Advanced Package Tool) works in Debian-based operating systems. Instead of including every necessary library or tool in their own code, developers can reference shared libraries and programs that already exist in the OS, making it very easy to run software without causing bloat. Much of how this works today stems from the Unix philosophy: do one thing and do it well.
Note
The Unix philosophy is about building simple and modular code that is highly extensible and can easily be repurposed or composed with other components to build bigger solutions with a clear separation of concerns. In other words: keep it simple; do one thing and do it well.
But how does this translate to Kubernetes? Well, instead of processes, software developers often refer to the software running in containers as microservices. Before the rise of containers, software development had a very monolithic character—there was a single, often huge code base everybody worked on, and software mostly ran on a single machine, making it necessary to handle interprocess communication within the application itself or rely on the underlying operating system to provide the necessary resources.
Today, most of the software that is powering the web as we know it is designed to have a clear API and do one thing—and do it well. Instead of writing huge and complex software monoliths, developers started to further compartmentalize their programs into single, independent microservices that talk to each other through their respective APIs over the network. These services are often stateless, meaning they can easily be scaled to multiple replicas for increased throughput or availability.
This new way of building software made it necessary to have a higher-level system that could orchestrate these microservices, schedule them on different servers, allocate resources to them, and provide ways for them to communicate with each other. That's what Kubernetes does for them: like an OS, it provides access to shared resources, enables them to discover each other, and keeps everything running smoothly.
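In practice, this can look as simple as the following sketch, in which a Deployment keeps a number of stateless replicas alive and a Service gives other microservices a stable name to reach them by (names and image are illustrative):

    # Run three stateless replicas and keep them running.
    kubectl create deployment web --image=nginx:1.27 --replicas=3

    # Give them a stable, discoverable address inside the cluster.
    kubectl expose deployment web --port=80

    # Other pods in the namespace can now reach them at http://web/,
    # load-balanced across the replicas. Scaling is one command:
    kubectl scale deployment web --replicas=5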
If you’ve ever worked with containers, this is nothing new for you. We commonly refer to this way of building software as cloud native architecture. The cloud was one of the biggest driving factors of technological innovation in the past decade and enabled many new patterns of building and running software, but it also introduced new challenges for software developers. In the following chapter, I will highlight some of these challenges and how Kubernetes and Portainer evolved as solutions to common problems in cloud native architecture.