Chapter 1. Kubernetes and OpenShift Overview
Over the past few years, Kubernetes has emerged as the de facto standard platform for managing, orchestrating, and provisioning container-based cloud native computing applications. Cloud native computing applications are essentially applications that are built from a collection of smaller services (microservices) and take advantage of the speed of development and scalability capabilities that cloud computing environments typically provide. Over time, Kubernetes has matured to provide the controls required to manage even more advanced and stateful workloads, such as databases and AI services. The Kubernetes ecosystem continues to experience explosive growth, and the project benefits greatly from being a multiple-vendor and meritocracy-based open source project backed by a solid governance policy and a level playing field for contributing.
Although many Kubernetes distributions are available for customers to choose from, the Red Hat OpenShift Kubernetes distribution is of particular interest. OpenShift has achieved broad adoption across a variety of industries; over one thousand enterprise customers across the globe currently use it to host their business applications and drive their digital transformation efforts.
This book focuses on enabling you to become an expert at running both traditional Kubernetes and the OpenShift distribution of Kubernetes in production environments. In this first chapter, we begin with a broad overview of both Kubernetes and OpenShift and the historical origin of both platforms. We then review the key features and capabilities that have made Kubernetes and OpenShift the dominant platforms for creating and deploying cloud native applications.
Kubernetes: Cloud Infrastructure for Orchestrating Containerized Applications
The emergence of Docker in 2013 introduced numerous developers to containers and container-based application development. Containers were presented as an alternative to virtual machines (VMs) for creating self-contained deployable units. Containers rely on advanced security and resource management features of the Linux operating system to provide isolation at the process level instead of relying on VMs for creating deployable units of software. A Linux process is much more lightweight and orders of magnitude more efficient than a VM for common activities like starting up an application image or creating new image snapshots. Because of these advantages, developers favored containers as the desired approach to creating new software applications as self-contained units of deployable software. As the popularity of containers grew, so did a need for a common platform for provisioning, managing, and orchestrating containers across a cluster. For more than a decade, Google had embraced the use of Linux containers as the foundation for applications deployed in its cloud.1 Google had extensive experience orchestrating and managing containers at scale and had developed three generations of container-management systems: Borg, Omega, and Kubernetes. The latest generation of container management developed by Google, Kubernetes was a redesign based on lessons learned from Borg and Omega and was made available as an open source project. Kubernetes delivered several key features that dramatically improved the experience of developing and deploying a scalable container-based cloud application:
- Declarative deployment model
- Most cloud infrastructures that existed before Kubernetes was released took a procedural approach based on a scripting language like Ansible, Chef, Puppet, and so on for automating the deployment of applications to production environments. In contrast, Kubernetes used a declarative approach of describing what the desired state of the system should be. Kubernetes infrastructure was then responsible for starting new containers when necessary (e.g., when a container failed) to achieve the desired declared state. The declarative model was much more clear at communicating which deployment actions were desired, and this approach was a huge step forward compared with trying to read and interpret a script to determine what the desired deployment state should be.
- Built-in replica and autoscaling support
- In some cloud infrastructures that existed before Kubernetes, support for replicas of an application and autoscaling capabilities were not part of the core infrastructure and, in some cases, never successfully materialized due to platform or architectural limitations. Autoscaling refers to the ability of a cloud environment to recognize that an application is becoming more heavily used, so the cloud environment automatically increases the capacity of the application, typically by creating more copies of the application on extra servers in the cloud environment. Autoscaling capabilities were provided as core features in Kubernetes and dramatically improved the robustness and consumability of its orchestration capabilities.
- Built-in rolling upgrades support
- Most cloud infrastructures do not provide support for upgrading applications. Instead, they assume the operator will use a scripting language, such as Chef, Puppet, or Ansible, to handle upgrades. In contrast, Kubernetes actually provides built-in support for rolling out upgrades of applications. For example, Kubernetes rollouts are configurable such that they can leverage extra resources for faster rollouts that have no downtime, or they can perform slower rollouts that do canary testing, reducing the risk and validating new software by releasing software to a small percentage of users to ensure that the new version of the application is stable. Kubernetes also supports pausing, resuming, and rolling back the version of an application.
- Improved networking model
- Kubernetes mapped a single IP address to a pod, which is Kubernetes’s smallest unit of container deployment, aggregation, and management. This approach aligned the network identity with the application identity and simplified running software on Kubernetes.2
- Built-in health-checking support
- Kubernetes provided container health-checking and monitoring capabilities that reduced the complexity of identifying when failures occur.
Even with all the innovative capabilities available in Kubernetes, many enterprise companies were still hesitant to adopt this technology because it was an open source project supported by a single vendor. Enterprise companies are careful about which open source projects they are willing to adopt, and they expect open source projects like Kubernetes to have multiple vendors contributing to them; they also expect open source projects to be meritocracy based with a solid governance policy and a level playing field for contributing. In 2015, the Cloud Native Computing Foundation (CNCF) was formed to address these issues facing Kubernetes.
CNCF Accelerates the Growth of the Kubernetes Ecosystem
In 2015, the Linux Foundation initiated the creation of the CNCF.3 The CNCF’s mission is to make cloud native computing ubiquitous.4 In support of this new foundation, Google donated Kubernetes to the CNCF to serve as its seed technology. With Kubernetes as the core of its ecosystem, the CNCF has grown to more than 440 member companies, including Google Cloud, IBM Cloud, Red Hat, Amazon Web Services (AWS), Docker, Microsoft Azure, VMware, Intel, Huawei, Cisco, Alibaba Cloud, and many more.5 In addition, the CNCF ecosystem has grown to hosting 26 open source projects, including Prometheus, Envoy, gRPC, etcd, and many others. Finally, the CNCF nurtures several early-stage projects and has had eight projects accepted into its Sandbox program for emerging technologies.
With the weight of the vendor-neutral CNCF foundation behind it, Kubernetes has grown to having more than 3,200 contributors annually from a wide range of industries.6 In addition to hosting several cloud native projects, the CNCF provides training, a Technical Oversight Board, a Governing Board, a community infrastructure lab, and several certification programs to boost the ecosystem for Kubernetes and related projects. As a result of these efforts, there are currently over one hundred certified distributions of Kubernetes. One of the most popular distributions of Kubernetes, particularly for enterprise customers, is Red Hat’s OpenShift Kubernetes. In the next section, we introduce OpenShift and give an overview of the key benefits it provides for developers and IT operations teams.
OpenShift: Red Hat’s Distribution of Kubernetes
Although many companies have contributed to Kubernetes, the contributions from Red Hat are particularly noteworthy. Red Hat has been a part of the Kubernetes ecosystem from its inception as an open source project, and it continues to serve as the second-largest contributor to Kubernetes. Based on this hands-on expertise with Kubernetes, Red Hat provides its own distribution of Kubernetes that it refers to as OpenShift. OpenShift is the most broadly deployed distribution of Kubernetes across the enterprise. It provides a 100% conformant Kubernetes platform and supplements it with a variety of tools and capabilities focused on improving the productivity of developers and IT operations.
OpenShift was originally released in 2011.7 At that time, it had its own platform-specific container runtime environment.8 In early 2014, the Red Hat team met with the container orchestration team at Google and learned about a new container orchestration project that eventually became Kubernetes. The Red Hat team was incredibly impressed with Kubernetes, and OpenShift was rewritten to use Kubernetes as its container orchestration engine. As a result of these efforts, OpenShift was able to deliver a 100% conformant Kubernetes platform as part of its version 3 release in June 2015.9
The Red Hat OpenShift Container Platform is Kubernetes with additional supporting capabilities to make it operational for enterprise needs. The Kubernetes community provides fixes for releases for a period of up to 12 months. OpenShift differentiates itself from other distributions by providing long-term support (three or more years) for major Kubernetes releases, security patches, and enterprise support contracts that cover both the operating system and the OpenShift Kubernetes platform. Red Hat Enterprise Linux (RHEL) has long been a de facto distribution of Linux for organizations large and small. Red Hat OpenShift Container Platform builds on RHEL to ensure consistent Linux distributions from the host operating system through all containerized functions on the cluster. In addition to all these benefits, OpenShift enhances Kubernetes by supplementing it with a variety of tools and capabilities focused on improving the productivity of both developers and IT operations. The following sections describe these benefits.
Benefits of OpenShift for Developers
While Kubernetes has a lot of functionality for provisioning and managing container images, it does not provide much support for creating new images from base images, pushing images to registries, or identifying when new versions become available. In addition, the networking support provided by Kubernetes can be quite complicated to use. To fill these gaps, OpenShift offers several benefits to developers beyond those provided by the core Kubernetes platform:
- When using basic Kubernetes, the cloud native application developer is responsible for creating their own container images. Typically, this involves finding the proper base image and creating a
Dockerfilewith all the necessary commands for taking a base image and adding in the developer’s code to create an assembled image that Kubernetes can deploy. This requires the developer to learn a variety of Docker commands that are used for image assembly. With its Source-to-Image (S2I) capability, OpenShift is able to handle merging the cloud native developer’s code into the base image. In many cases, S2I can be configured such that all the developer needs to do is commit their changes to a Git repository, and S2I will see the updated changes and merge them with a base image to create a new assembled image for deployment.
- Push images to registries
- Another key step that must be performed by the cloud native developer when using basic Kubernetes is storing newly assembled container images in an image registry such as Docker Hub. In this case, the developer needs to create and manage the repository. In contrast, OpenShift provides its own private registry and developers can use that option, or S2I can be configured to push assembled images to third-party registries.
- Image streams
- When developers create cloud native applications, the development effort results in a large number of configuration changes, as well as changes to the container image of the application. To address this complexity, OpenShift provides the image stream functionality, which monitors for configuration or image changes and performs automated builds and deployments based on the change events. This feature takes the burden off the developer of having to perform these steps manually whenever changes occur.
- Base image catalog
- OpenShift provides a base image catalog with a large number of useful base images for a variety of tools and platforms, such as WebSphere Liberty, JBoss, PHP, Redis, Jenkins, Python, .NET, MariaDB, and many others. The catalog provides trusted content that is packaged from known source code.
- Networking in base Kubernetes can be quite complicated to configure. OpenShift has a route construct that interfaces with Kubernetes services and is responsible for adding Kubernetes services to an external load balancer. Routes also provide readable URLs for applications and a variety of load-balancing strategies to support several deployment options, such as blue-green, canary, and A/B testing deployments.10
While OpenShift has a large number of benefits for developers, its greatest differentiators are the benefits it gives IT operations. In the next section, we describe several of the core capabilities for automating the day-to day-operations of running OpenShift in production.
Benefits of OpenShift for IT Operations
In May 2019, Red Hat announced the release of OpenShift 4.11 Red Hat acquired CoreOS, which had a very automated approach to managing Kubernetes’s life-cycle behavior and was an early advocate of the “operator” concept. This new version of OpenShift was completely rewritten to build on capabilities from CoreOS’s innovative management practices and OpenShift 3’s reputation for reliability, which dramatically improved how the OpenShift platform is installed, upgraded, and managed.11 To deliver these significant life-cycle improvements, OpenShift heavily used the latest Kubernetes innovations and best practices for automating the management of resources in its architecture. As a result of these efforts, OpenShift 4 is able to deliver the following benefits for IT operations:
- Automated installation
- OpenShift 4 supports an innovative installation approach that is automated, reliable, and repeatable.12 Additionally, the OpenShift 4 installation process supports full stack automated deployments and can handle installing the complete infrastructure, including components like DNS and the VM.
- Automated operating system and OpenShift platform updates
- OpenShift is tightly integrated with the lightweight RHEL CoreOS operating system, which itself is optimized for running OpenShift and cloud native applications. Thanks to the tight coupling of OpenShift with a specific version of RHEL CoreOS, the OpenShift platform is able to manage updating the operating system as part of its cluster management operations. The key value of this approach for IT operations is that it supports automated, self-managing, over-the-air updates. This enables OpenShift to support cloud native and hands-free operations.
- Automated cluster size management
- OpenShift supports the ability to automatically increase or decrease the size of the cluster it is managing. Like all Kubernetes clusters, an OpenShift cluster has a certain number of worker nodes on which the container applications are deployed. In a typical Kubernetes cluster, adding worker nodes is an out-of-band operation that IT operations must handle manually. In contrast, OpenShift provides a component called the machine operator that is capable of automatically adding worker nodes to a cluster. An IT operator can use a
MachineSetobject to declare the number of machines needed by the cluster, and OpenShift will automatically perform the provisioning and installation of new worker nodes to achieve the desired state.
- Automated cluster version management
- OpenShift, like all Kubernetes distributions, is composed of a large number of components. Each of these components has its own version number. To manage updating each of these components, OpenShift relies on a Kubernetes innovation called the operator construct. OpenShift uses a cluster version number to identify which version of OpenShift is running, and this cluster version number denotes which versions of the individual OpenShift platform components need to be installed. With its automated cluster version management, OpenShift is able to install the proper versions of all these components automatically to ensure that it is properly updated when the cluster is updated to a new version.
- Multicloud management support
- Many enterprise customers that use OpenShift have multiple clusters, and these clusters are deployed across multiple clouds or in multiple data centers. To simplify the management of multiple clusters, OpenShift 4 has introduced a new unified cloud console that allows customers to view and manage multiple OpenShift clusters.13
As we will see later in this book, OpenShift and the capabilities it provides become extremely prominent when it’s time to run in production and IT operators need to address operational and security-related concerns.
In this chapter, we gave an overview of Kubernetes and OpenShift, including the historical origins of both platforms. We then presented the key benefits provided by both Kubernetes and OpenShift that have driven the huge growth in popularity of these platforms. This has helped us have a greater appreciation for the value that Kubernetes and OpenShift provide to cloud native application developers and IT operations teams. Thus, it is no surprise that these platforms are experiencing explosive growth across a variety of industries. In Chapter 2, we will build a solid foundational overview of Kubernetes and OpenShift that presents the Kubernetes architecture, discusses how to get Kubernetes and OpenShift production environments up and running, and introduces several key Kubernetes and OpenShift concepts that are critical to running successfully in production.
13 Fernandes, “Introducing Red Hat OpenShift 4: Kubernetes for the Enterprise.”