Chapter 4. Kubernetes: The Grand Orchestrator

As monolithic applications were broken down into microservices, containers became the de facto housing for these microservices. Microservices are a cloud native architectural approach in which a single application is composed of many smaller, loosely coupled, and independently deployable components or services. Containers ensure that software runs correctly when moved between different environments, and through containers, microservices work in unison with one another to form a fully functioning application.

While breaking monolithic applications into smaller services solves one problem, it creates bigger problems in terms of managing and maintaining an application without significant downtime, networking the various microservices, distributed storage, and so on. Containers help by decoupling applications into fast-moving, smaller codebases with a focus on feature development. However, although this decoupling is clean and easy at first since there are fewer containers to manage, as the number of microservices in an application increases it becomes nearly impossible to debug, update, or even deploy the containerized microservices safely into the stack without breaking something or causing downtime.

Containerizing applications was the first big step toward creating self-healable environments with zero downtime, but this practice needed to evolve further, especially in terms of software development and delivery in cloud native environments. This led to the development of schedulers and orchestrator engines like Mesos, Docker Swarm, Nomad, and Kubernetes. In this chapter and the next, we will focus on Kubernetes due to its maturity and wide industry adoption.

Google introduced Kubernetes to the world in 2014, after spending almost a decade perfecting and learning from its internal cluster management system, Borg. In simple terms, Kubernetes is an open source container orchestration system. The term container orchestration primarily refers to the full lifecycle of managing containers in dynamic environments like the cloud, where machines (servers) come and go on an as-needed basis. A container orchestrator automates and manages a variety of tasks, such as provisioning and deployment, scheduling, resource allocation, scaling, load balancing, and monitoring container health. In 2015, Google donated Kubernetes to the Cloud Native Computing Foundation (CNCF).

Kubernetes (aka K8s) is one of the most widely adopted pieces of a cloud native solution since it takes care of deployment, scalability, and maintenance of containerized applications. Kubernetes helps site reliability engineers (SREs) and DevOps engineers run their cloud workload with resiliency while taking care of scaling and failover of applications that span multiple containers across clusters.

Why Is Kubernetes Called K8s?

Kubernetes is derived from a Greek word that means helmsman or pilot of a ship. Kubernetes is also called K8s. This numeronym was derived by replacing the eight letters between K and S (“ubernete”) with an 8.

Following are some of the key features that Kubernetes provides right out of the box:

Self-healing
One of the most prominent features of Kubernetes is the process of spinning up new containers when a container crashes.
Service discovery
In a cloud native environment, the containers move from one host to another. The process of actually figuring out how to connect to a service/application that is running in a container is referred to as service discovery. Kubernetes exposes containers automatically using DNS or the containers’ IP address.
Load balancing
To keep the deployed application in a stable state, Kubernetes load-balances and distributes the incoming traffic automatically.
Automatic deployments
Kubernetes works on a declarative syntax, which means you don’t have to worry about how to deploy applications. Rather, you declare what needs to be deployed and Kubernetes takes care of it.
Bin packing
To make the best use of compute resources, Kubernetes automatically deploys containers on the best possible host without wasting resources or impairing the availability of other containers.

In this chapter, we will dive into the major components and the underlying concepts of Kubernetes. Though this chapter doesn’t aim to teach Kubernetes in full, as there are already a wide array of books1 that cover the topic at length, we want to build a strong foundation. This hands-on approach will help you better understand the nitty-gritty of the overall environment. Let’s start with a discussion of how the Kubernetes cluster works.

Kubernetes Components

The Kubernetes cluster contains two types of node components:

Control plane
This is the governing component of the Kubernetes cluster. It ensures that a couple of important services (e.g., scheduling, starting up new Pods2) are always running. The core purpose of the control plane node is to make sure the cluster is always in a healthy and correct state.
Worker nodes
These are the compute instances that run your workloads in the Kubernetes cluster and host all your containers.

Figure 4-1 depicts the high-level components of Kubernetes. The lines signify the connections, such as the worker nodes accepting connections from the load balancer, which distributes traffic.

Figure 4-1. Kubernetes components

Now let’s take a detailed look at each component.

Control Plane

The control plane is primarily responsible for making global decisions about the Kubernetes cluster, such as detecting cluster state, scheduling Pods on nodes, and managing the lifecycle of the Pods. The Kubernetes control plane has several components, which we describe in the following subsections.

kube-apiserver (API server)

The API server is the frontend of the Kubernetes control plane and the only component with direct access to the entire Kubernetes cluster. As you saw in Figure 4-1, the API server serves as the central point for all interactions between the worker nodes and the control plane nodes. The services running in a Kubernetes cluster use the API server to communicate with one another. You can run multiple instances of the API server because it is designed to scale horizontally.

Kube scheduler

The Kube scheduler is responsible for determining which worker node will run a Pod (the basic unit of work; we will explain Pods in more detail later in this chapter). The Kube scheduler talks to the API server to determine which worker nodes are available to run the scheduled Pod in the best possible way. The scheduler looks for newly created Pods that are yet to be assigned to a node, finds feasible nodes as potential candidates, and scores each of them based on different factors, such as node resource capacity and hardware requirements, to ensure that the correct scheduling decision is made. The node with the highest score is chosen to run the Pod. The scheduler then notifies the API server about this decision in a process referred to as binding.

Kube controller manager

Kubernetes has a core built-in feature that implements the self-healing capabilities in the cluster. This feature is called the Kube controller manager and it runs as a daemon. The controller manager executes a control loop called the reconciliation loop, which is a nonterminating loop that is responsible for the following:

  • Determining whether a node has gone down, and if so, taking action. This is done by the Node controller.

  • Maintaining the correct number of Pods. This is done by the Replication controller.

  • Joining the endpoint objects (i.e., services and Pods). This is done by the Endpoint controller.

  • Ensuring that default accounts and endpoints are created for new namespaces. This is done by the Service Account and Token controllers.

The reconciliation loop is the driving force behind the self-healing capability of Kubernetes. Kubernetes determines the state of the cluster and its objects by continuously running the following steps in a loop:

  1. Fetch the user-declared state (the desired state).

  2. Observe the state of the cluster.

  3. Compare the observed and desired states to find differences.

  4. Take actions to make the observed state match the desired state.

etcd

Kubernetes uses etcd as a data store. etcd is a key-value store that is responsible for persisting all the Kubernetes objects. It was originally created by the CoreOS team and is now managed by CNCF. It is usually spun up in a highly available setup, and the etcd nodes are hosted on separate instances.

Worker Nodes

A Kubernetes cluster contains a set of worker machines called worker nodes that run the containerized applications. The control plane manages the worker nodes and the Pods in the cluster. Some components run on all Kubernetes worker nodes; they are discussed in the following subsections.

Kubelet

Kubelet is the daemon agent that runs on every node to ensure that containers in a Pod are always in a running state and are healthy. Kubelet reports to the API server about the currently available resources (CPU, memory, disk) on the worker nodes so that the API server can use the controller manager to observe the Pods’ state. As the agent running on each worker node, kubelet also handles basic housekeeping tasks such as restarting containers if required and consistently conducting health checks.

Kube-proxy

Kube-proxy is the networking component that runs on each node. Kube-proxy watches all Kubernetes services3 in the cluster and ensures that when a request to a particular service is made, it gets routed to the particular virtual IP endpoint. Kube-proxy is responsible for implementing a kind of virtual IP for services.

Now that you’ve got the basics of Kubernetes components under your belt, let’s dig a bit deeper and learn more about the Kubernetes API server.

Kubernetes API Server Objects

The API server is responsible for all the communication inside and outside a Kubernetes cluster, and it exposes a RESTful HTTP API. The API server, at a fundamental level, allows you to query and manipulate Kubernetes objects. In simple terms, Kubernetes objects are stateful entities that represent the overall state of your cluster (Figure 4-2). To start working with these objects, we need to understand the fundamentals of each.

Figure 4-2. API server interaction with cluster objects

Pods

In Kubernetes, Pods are the smallest basic atomic unit. A Pod is a group of one or more containers that get deployed on the worker nodes. Kubernetes is responsible for managing the containers running inside a Pod. Containers inside a Pod always end up on the same worker node and are tightly coupled. Since the containers inside a Pod are co-located, they run in the same context (i.e., they share the network and storage). This shared context is implemented using Linux namespaces, cgroups, and the other isolation mechanisms we explained in Chapter 3. Each Pod also gets a unique IP address.

In typical scenarios, a single container is run inside a Pod, but in some instances multiple containers need to work together in a Pod. In this latter setup, the additional container is usually referred to as a sidecar container. One of the most common examples of running a sidecar container is running a logging container for your application that ships your logs to external storage, such as an ELK (Elasticsearch, Logstash, and Kibana) server, in case your application Pod crashes or is deleted. Pods are also smart in that if a process in a container dies, Kubernetes will instantly restart it based on the health checks defined at the application level.
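To make this concrete, here is a minimal sketch of a multicontainer Pod with a logging sidecar. The Pod name, images, and the tail-based "shipper" command are illustrative stand-ins (a real sidecar would typically run an agent such as Fluentd or Logstash); both containers share an emptyDir volume, which is the shared context described earlier:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/nginx
  - name: log-shipper
    # Illustrative sidecar: a real one would forward the shared logs to external storage
    image: busybox
    command: ['sh', '-c', 'tail -n+1 -F /var/log/nginx/access.log']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/nginx
  volumes:
  - name: app-logs
    emptyDir: {}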

Another characteristic of Pods is that they allow horizontal scaling by replication implemented through ReplicaSets. This means that if you want your application to scale horizontally, you should create more Pods by using ReplicaSets.

Pods are also ephemeral in nature, which means that if a Pod gets killed, it will be moved and restarted on a different host. This is also accomplished by using ReplicaSets.

ReplicaSets

Reliability is a chief characteristic of Kubernetes, and since no one would be running a single instance of a Pod, redundancy becomes important. A ReplicaSet is a Kubernetes object that ensures that a stable set of replica Pods are running to maintain a self-healing cluster. All of this is achieved by the reconciliation loop, which keeps running in the background to observe the overall state of the Kubernetes cluster. ReplicaSets use the reconciliation loop and ensure that if any of the Pods crash or get restarted, a new Pod will be started in order to maintain the desired state of replication. In general, you should not directly deal with ReplicaSets, but rather, use the Deployment object, which ensures zero-downtime updates to your application and has a declarative approach toward managing and operating Kubernetes.

Deployments

Kubernetes is primarily a declarative syntax-focused orchestrator. This means that in order to roll out new features, you need to tell Kubernetes what you need to do and it’s up to Kubernetes to figure out how to perform that operation in a safe manner. One of the objects that Kubernetes offers in order to make the release of new versions of applications smoother is Deployment. If you go on manually updating Pods, you will need to restart the Pods, which will cause downtime. While a ReplicaSet knows how to maintain the desired number of Pods, it won’t do a zero-downtime upgrade. Here is where the Deployment object comes into the picture, as it helps roll out changes to Pods with zero downtime by keeping a predefined number of Pods active all the time before a new updated Pod is rolled out.

Services

To expose an application that is running inside a Pod, Kubernetes offers an object called Service. Since Kubernetes is a very dynamic system, it is necessary to ensure that applications are talking to the correct backends. Pods are short-lived processes in the Kubernetes world, as they are frequently created and destroyed. Each Pod is coupled with a unique IP address, which means that if you rely on just the IP address of Pods, you will most likely end up with a broken service when a Pod dies, because the Pod will get a different IP address after restart, even though you have ReplicaSets running. The Service object offers an abstraction by defining a logical set of Pods and a policy by which to access them. Each Service gets a stable IP address and a DNS name that can be used to access the Pods. You declaratively define the Service that fronts your Pods and use a label selector to determine which Pods it routes to.

Namespaces

Since multiple teams and projects are deployed in a production-grade environment, it becomes necessary to organize the Kubernetes objects. In a simple sense, namespaces are virtual clusters separated by logical partitioning; that is, you can group your resources, such as deployments, Pods, and so on, based on logical partitions. Some people like to think of namespaces as directories to separate names. Every object in your cluster has a name that is unique to that particular type of resource, and similarly every object has a UID that is unique across the whole cluster. Namespaces also allow you to divide the cluster resources among multiple users by setting resource quotas.
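As a quick sketch (the namespace name and quota values here are arbitrary examples), you can create a namespace and attach a resource quota to it declaratively:

apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    pods: "20"
    requests.cpu: "4"
    requests.memory: 8Gi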

Labels and Selectors

As you start using Kubernetes and creating objects, you’ll realize the need to identify or mark your Kubernetes resources in order to group them into logical entities. Kubernetes offers labels to identify metadata for objects, which easily allows you to group and operate the resources. Labels are key-value pairs that can be attached directly to objects like Pods, namespaces, DaemonSets, and so on. You can add labels at any time and modify them as you like. To find or identify your Kubernetes resources, you can query the labels using label selectors. For example, a label for a type of application tier can be:

"tier" : "frontend", "tier" : "backend", "tier" : "midtier"

Annotations

Annotations are also key-value pairs, but unlike labels, which are used to identify objects, annotations are used to hold nonidentifying information about the object itself. For example, build, release, or image information such as timestamps, release IDs, the Git branch, PR numbers, image hashes, and the registry address can be recorded in an annotation.
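For illustration, annotations live under metadata just like labels; the keys and values below are made-up examples of the kind of release metadata you might record:

metadata:
  name: myapp
  annotations:
    buildTimestamp: "2020-09-07T01:08:53Z"
    gitBranch: "release-1.4"
    releaseId: "42"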

Ingress Controller

For your services to receive traffic from the internet, you need to expose HTTP and HTTPS endpoints from the outside world to the Kubernetes services running on Pods. An Ingress allows you to expose your services running inside the cluster to the outside world, offering load balancing, Secure Sockets Layer/Transport Layer Security (SSL/TLS) termination, and name-based virtual hosting. To support an Ingress, you first choose an ingress controller, which is similar to a reverse proxy, to accept incoming HTTP and HTTPS connections.

StatefulSets

To manage and scale your stateful workloads on Kubernetes, you need to ensure that the Pods are stable (i.e., have stable network identities and stable storage). StatefulSets ensure the ordering of the Pods and maintain their uniqueness (unlike ReplicaSets). A StatefulSet is a controller that helps you deploy groups of Pods that remain resilient to restarts and reschedules. Each Pod in a StatefulSet has a unique, stable name with an ordinal index starting at 0 and a stable network ID associated with it (unlike a ReplicaSet, in which Pod names are random in nature).
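The following is a minimal sketch of a StatefulSet; the names, the headless Service it references, and the storage size are illustrative. Kubernetes would create the Pods as web-0, web-1, and web-2, each with its own PersistentVolumeClaim (claims are covered later in this chapter):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "web-headless"  # headless Service providing stable network IDs
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi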

DaemonSets

In regular environments, we run a number of daemon services and agents on the host, including logging agents and monitoring agents. Kubernetes allows you to install such agents by running a copy of a Pod across a set of nodes in a cluster using DaemonSets. Just like ReplicaSets, DaemonSets are long-running processes that ensure that the desired state and observed state remain the same. Deleting a DaemonSet also deletes the Pods that it previously created.

Jobs

Jobs in the Kubernetes world are short-lived entities that run small tasks, such as a standalone script. Jobs eventually create Pods. Jobs run until they terminate successfully, which is the major difference between a Pod created by a Job and a regular Pod, which keeps getting restarted and rescheduled if terminated. If a Job Pod fails before completion, the controller will create a new Pod based on the template.

Note

This chapter is a crash course in Kubernetes and just scratches the surface of container orchestration. Other resources are available that can help you learn more about Kubernetes. Some of the ones we recommend are:

  • Managing Kubernetes by Brendan Burns and Craig Tracey (O’Reilly, 2018)

  • Kubernetes: Up and Running, 2nd Edition by Brendan Burns, Joe Beda, and Kelsey Hightower (O’Reilly, 2019)

  • “Introduction to Kubernetes”, a free course from LinuxFoundationX

  • Cloud Native DevOps with Kubernetes, 2nd Edition by John Arundel and Justin Domingus (O’Reilly, 2022)

Now that we have covered some basic terminology in the Kubernetes world, let’s take a look at the operational details for managing the cluster.

Observe, Operate, and Manage Kubernetes Clusters with kubectl

One of the more common ways to interact with a container orchestrator is by using either a command-line tool or a graphical tool. In Kubernetes you can interact with the cluster in both ways, but the preferred way is the command line. Kubernetes offers kubectl as its CLI, and kubectl is widely used to administer the cluster. You can consider kubectl a Swiss Army knife of Kubernetes functions that enables you to deploy and manage applications. If you are an administrator of a Kubernetes cluster, you will use the kubectl command extensively to manage the cluster. kubectl offers a variety of commands, including:

  • Resource configuration commands, which are declarative in nature

  • Debugging commands for getting information about workloads

  • Debugging commands for manipulating and interacting with Pods

  • General cluster management commands

In this section, we will explore Kubernetes in depth by focusing on basic cluster commands for managing Kubernetes clusters, Pods, and other objects with the help of kubectl.

General Cluster Information and Commands

The first step to interacting with the Kubernetes cluster is to learn how to gain insights on the cluster, infrastructure, and Kubernetes components that you’ll be working with. To see the worker nodes that run the workloads in your cluster, issue the following kubectl command:

$  ~ kubectl get nodes
NAME      STATUS   ROLES    AGE   VERSION
worker0   Ready    <none>   11d   v1.17.3
worker1   Ready    <none>   11d   v1.17.3
worker2   Ready    <none>   11d   v1.17.3

This will list your node resources and their status, along with version information. You can gain a bit more information by using the -o wide flag in the get nodes command as follows:

$  ~ kubectl get nodes -o wide
NAME      STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             
worker0   Ready    <none>   11d   v1.17.3   10.240.0.20   <none>        
worker1   Ready    <none>   11d   v1.17.3   10.240.0.21   <none>        
worker2   Ready    <none>   11d   v1.17.3   10.240.0.22   <none>        

KERNEL-VERSION      CONTAINER-RUNTIME
Ubuntu 16.04.7 LTS   4.15.0-1092-azure   containerd://1.3.2
Ubuntu 16.04.7 LTS   4.15.0-1092-azure   containerd://1.3.2
Ubuntu 16.04.7 LTS   4.15.0-1092-azure   containerd://1.3.2

To get even more details specific to a resource or worker, you can use the describe command as follows:

$ ~ kubectl describe nodes worker0

The kubectl describe command is a very useful debugging command. You can use it to gain in-depth information about Pods and other resources.

The get command can be used to gain more information about Pods, services, replication controllers, and so on. For example, to get information on all Pods, you can run:

$  ~ kubectl get pods
NAME                       READY   STATUS              RESTARTS   AGE
busybox-56d8458597-xpjcg   0/1     ContainerCreating   0          1m

To get a list of all the namespaces in your cluster, you can use get as follows:

$  ~ kubectl get namespace
NAME                   STATUS   AGE
default                Active   12d
kube-node-lease        Active   12d
kube-public            Active   12d
kube-system            Active   12d
kubernetes-dashboard   Active   46h

By default, kubectl interacts with the default namespace. To use a different namespace, you can pass the --namespace flag to reference objects in a namespace:

$  ~ kubectl get pods --namespace=default
NAME                       READY   STATUS    RESTARTS   AGE
busybox-56d8458597-xpjcg   1/1     Running   0          47h

$  ~ kubectl get pods --namespace=kube-public
No resources found in kube-public namespace.

$  ~ kubectl get pods --namespace=kubernetes-dashboard
NAME                                         READY  STATUS    RESTARTS   AGE
dashboard-metrics-scraper-779f5454cb-gq82q   1/1    Running   0          2m
kubernetes-dashboard-857bb4c778-qxsnj        1/1    Running   0          46h

To view the overall cluster details, you can make use of cluster-info as follows:

$  ~ kubectl cluster-info
Kubernetes master is running at https://40.70.3.6:6443
CoreDNS is running at https://40.70.3.6:6443/api/v1/namespaces/kube-system/services/ \
  kube-dns:dns/proxy

The cluster-info command gives you the endpoint of the API server (or the load balancer fronting the control plane), along with other control plane components such as CoreDNS.

In Kubernetes, to maintain a connection with a specific cluster you use a context. A context helps group access parameters under one name. Each context contains a Kubernetes cluster, a user, and a namespace. The current context is the cluster that is currently the default for kubectl, and all the commands being issued by kubectl run against this cluster. You can view your current context as follows:

$  ~ kubectl config current-context
cloud-native-azure

To change the default namespace, you can use a context that gets registered to your environment’s kubectl kubeconfig file. The kubeconfig file is the actual file that tells kubectl how to find your Kubernetes cluster and then authenticate based on the secrets that have been configured. The file is usually stored in your home directory under .kube/. To create and use a new context with a different namespace as its default, you can do the following:

$  ~ kubectl config set-context test --namespace=mystuff
Context "test" created.
$  ~ kubectl config use-context test
Switched to context "test"

Make sure the context (the test Kubernetes cluster) actually exists, or else you won’t be able to use it.

Labels, as we mentioned before, are used to organize your objects. For example, if you want to label the busybox Pod with a value called production, you can do it as follows, where environment is the label name:

$  ~ kubectl label pods busybox-56d8458597-xpjcg environment=production
pod/busybox-56d8458597-xpjcg labeled

At times, you might want to find out what is wrong with your Pods and try to debug an issue. To see a log for a Pod, you can run the logs command on a Pod name:

$  ~ kubectl get pods
NAME                       READY   STATUS              RESTARTS   AGE
busybox-56d8458597-xpjcg   0/1     ContainerCreating   0          2d
$  ~ kubectl logs busybox-56d8458597-xpjcg
Error from server (BadRequest): container "busybox" in pod "busybox-56d8458597-xpjcg" is
waiting to start: ContainerCreating

Sometimes you can have multiple containers running inside a Pod. To get the logs of a specific container, you can pass the -c flag with the container name, as shown next.
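For example, assuming the Pod also ran a container named log-shipper (a hypothetical name), you could fetch just that container’s logs:

$ kubectl logs busybox-56d8458597-xpjcg -c log-shipper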

Managing Pods

As we discussed earlier, Pods are the smallest deployable artifacts in Kubernetes. As an end user or administrator, you deal directly with Pods and not with containers. The containers are handled by Kubernetes internally, and this logic is abstracted. It is also important to remember that all the containers inside a Pod are placed on the same node. Pods also have a defined lifecycle whose states move from Pending to Running to Succeeded or Failed. 

One of the ways you can create a Pod is by using the kubectl run command as follows:

$ kubectl run <name of pod> --image=<name of the image from registry>

The kubectl run command pulls a public image from a container repository and creates a Pod. For example, you can run a hazelcast image as follows, and expose the container’s port too:

$  ~ kubectl run hazelcast --image=hazelcast/hazelcast --port=5701
deployment.apps/hazelcast created
Note

In production environments, you should never run or create bare Pods directly, because such Pods are not managed by a controller and will not get restarted or rescheduled in case of a failure. You should use Deployments as the preferred way of operating Pods.

The kubectl run command has a rich feature set, and you can control many Pod behaviors with it. For example, if you wish to run a Pod in the foreground (i.e., with an interactive terminal inside the Pod) and you don’t wish to restart it if it crashes, you can do the following:

$  ~ kubectl run -i -t busybox --image=busybox --restart=Never
If you don't see a command prompt, try pressing enter.
/ #

This command will log you directly into the container. And since you ran the Pod in interactive mode, once you exit you can watch the status of busybox change from Running to Completed, as in the following:

$  ~ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
busybox                      1/1     Running   0          52s

$  ~ kubectl get pods
NAME                         READY   STATUS      RESTARTS   AGE
busybox                      0/1     Completed   0          61s

Another way to create Pods in Kubernetes is by using the declarative syntax in Pod manifests. The Pod manifests, which should be treated with the same importance as your application code, can be written using either YAML or JSON. You can create a Pod manifest for running an Nginx Pod as follows:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      ports:
        - containerPort: 80
          name: http

The Pod manifest contains information such as the kind, spec, and other fields that are sent to the Kubernetes API server to act on. You can save the Pod manifest with a YAML extension and use apply as follows:

$ kubectl apply -f nginx_pod.yaml
pod/nginx created
$ kubectl get pods
NAME                         READY   STATUS      RESTARTS   AGE
nginx                        1/1     Running     0          7s

When you run kubectl apply, the Pod manifest is sent to the Kubernetes API server, which instantly schedules the Pod to run on a healthy node in the cluster. The Pod is then monitored by the kubelet daemon process, which restarts its containers if they crash; rescheduling onto a different healthy node is handled by controllers such as Deployments, discussed later.

Let’s now move on and take a look at how you can implement health checks on your services in Kubernetes.

Health checks

Kubernetes offers three types of health checks, called probes, for ensuring the application is actually alive and well: liveness probes, readiness probes, and startup probes.

Liveness probe

This probe is responsible for ensuring that the application is actually healthy and fully functioning. After deployment, your application may take a few seconds before it is ready, so you can configure this probe to check a certain endpoint in your application. For example, in the following Pod manifest, a liveness probe is used to perform an httpGet operation against the / path on port 80. The initialDelaySeconds is set to 2, which means the endpoint / will not be probed until this period has elapsed. Additionally, we have set the timeout to 1 second and the failure threshold to 3 consecutive probe failures. The periodSeconds field defines how frequently Kubernetes probes the Pod; in this case, every 15 seconds.

apiVersion: v1
kind: Pod
metadata:
  name: mytest-pod
spec:
  containers:
  - image: test_image
    imagePullPolicy: IfNotPresent
    name: mytest-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      timeoutSeconds: 1
      periodSeconds: 15
      failureThreshold: 3

Readiness probe

The readiness probe’s responsibility is to identify when a container is ready to serve the user request. A readiness probe helps Kubernetes by not adding an unready Pod’s endpoint to a load balancer too early. A readiness probe can be configured simultaneously with a liveness probe block in the Pod manifest as follows:

  containers:
  - name: mytest-container
    image: test_image
    command: ["/bin/sh"]
    args: ["-c", "echo Container 1 is Running ; sleep 3600"]
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5

Startup probe

Sometimes an application requires some additional startup time on its first initialization. In such cases, you can set up a startup probe with an HTTP or TCP check whose failureThreshold * periodSeconds is long enough to cover the worst-case startup time:

startupProbe:
  httpGet:
    path: /healthapi
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10

Resource limits

When you deal with Pods, you can specify the resources your application will need. Some of the most basic resource requirements for a Pod to run are CPU and memory. Although there are more types of resources that Kubernetes can handle, we will keep it simple here and discuss only CPU and memory.

You can declare two parameters in the Pod manifest: request and limit. In the request block you tell Kubernetes the minimum resources your application needs to operate, and in the limit block you tell Kubernetes the maximum threshold. If your application breaches the memory threshold in the limit block, the container is terminated and restarted, and the Pod may be evicted from the node; CPU usage above the limit is throttled. For example, in the following Pod manifest, we are placing a max CPU limit of 500m4 and a max memory limit of 206Mi,5 which means that if the memory limit is crossed, the Pod can be evicted and rescheduled on some other node:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      ports:
        - containerPort: 80
          name: http
      resources:
        requests:
          cpu: "100m"
          memory: "108Mi"
        limits:
          cpu: "500m"
          memory: "206Mi"

Volumes

Some applications require data to be stored permanently, but because Pods are short-lived entities that are frequently restarted or killed on the fly, all data associated with a Pod can be destroyed as well. Volumes solve this problem by providing an abstraction layer over the underlying storage. A volume is a way to store, retrieve, and persist data across Pods throughout the application lifecycle. If your application is stateful, you need to use a volume to persist your data. In Azure, Azure Disk and Azure Files provide the data volumes that deliver this functionality.

Persistent Volume Claim (PVC)

The PVC serves as an abstraction layer between the Pod and storage. In Kubernetes, the Pod mounts volumes with the help of PVCs, and the PVCs talk to the underlying resources. One thing to note is that a PVC lets you consume the abstracted underlying storage resource; that is, it lets you claim a piece of preprovisioned storage. The PVC defines the disk size and disk type and then mounts the real storage to the Pod; this binding process can be static, as with a persistent volume (PV), or dynamic, as with a Storage class. For example, in the following manifest we are creating a PersistentVolumeClaim requesting 1Gi of storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: persistent-volume-claim-app1
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: azurefilestorage
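A Pod then consumes the claim by referencing it as a volume and mounting that volume into a container. The following sketch uses the claim defined above; the Pod name, image, and mount path are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: app1
spec:
  containers:
  - name: app1
    image: nginx
    volumeMounts:
    - name: app1-data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: app1-data
    persistentVolumeClaim:
      claimName: persistent-volume-claim-app1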

Persistent volume—static

A cluster administrator, usually the SRE or DevOps team, can create a predefined number of persistent volumes manually, which can then be used by the cluster users as they require. Persistent volume is the static provisioning method. For example, in the following manifest we are creating a PersistentVolume with a storage capacity of 2Gi:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-persistent-volume-app1
  labels:
    storage: azurefile
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  storageClassName: azurefilestorage
  azureFile:
    secretName: static-persistence-secret
    shareName: user-app1
    readOnly: false

Storage class—dynamic

Storage classes allow volumes to be dynamically provisioned for PVCs (i.e., they allow storage volumes to be created on demand). Storage classes basically provide cluster administrators a way to describe the classes of storage that can be offered. Each storage class has a provisioner that determines which volume plug-in is used for provisioning the persistent volumes.

In Azure, two kinds of provisioners determine the kind of storage that will be used: AzureFile and AzureDisk.

AzureFile can be used with ReadWriteMany access mode:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile
provisioner: kubernetes.io/azure-file
parameters:
  skuName: Standard_LRS
  location: eastus
  storageAccount: azure_storage_account_name

AzureDisk can only be used with ReadWriteOnce access mode:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Shared

Figure 4-3 showcases the logical relationship between the various storage objects that Kubernetes offers.

Figure 4-3. Relationship between Pods, the PVC, PV, and Storage class

Lastly, you can delete a Pod by using kubectl delete pod as follows:

$  ~ kubectl delete pod nginx
pod "nginx" deleted

You should make sure that when you delete a Pod, it’s not actually controlled by a Deployment, because if it is, it will reappear. For example:

$  ~ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
hazelcast-84bb5bb976-vrk97   1/1     Running   2          2d7h
nginx-76df748b9-rdbqq        1/1     Running   2          2d7h
$  ~ kubectl delete pod hazelcast-84bb5bb976-vrk97
pod "hazelcast-84bb5bb976-vrk97" deleted
$  ~ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
hazelcast-84bb5bb976-rpcfh   1/1     Running   0          13s
nginx-76df748b9-rdbqq        1/1     Running   2          2d7h

The reason a new Pod was spun up is that Kubernetes observed that the desired and observed states of the cluster no longer matched, and the reconciliation loop kicked in to rebalance the Pods. This happened because the Pod was actually deployed via a Deployment and therefore had a ReplicaSet associated with it. So, in order to delete such a Pod, you need to delete the Deployment, which in turn automatically deletes the Pod. This is shown as follows:

$  ~ kubectl get deployments --all-namespaces
NAMESPACE              NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
default                hazelcast                   1/1     1            1           2d21h
default                nginx                       1/1     1            1           2d21h
kube-system            coredns                     2/2     2            2           3d7h
$  ~ kubectl delete -n default deployment hazelcast
deployment.apps "hazelcast" deleted
$  ~ kubectl get pods
NAME                         READY   STATUS        RESTARTS   AGE
hazelcast-84bb5bb976-rpcfh   0/1     Terminating   0          21m
nginx-76df748b9-rdbqq        1/1     Running       2          2d7h

Kubernetes in Production

Now that we have covered the basics of Pods, Kubernetes concepts, and how the Kubernetes cluster works, let’s take a look at how to tie these pieces together and become production ready on a Kubernetes cluster. In this section, we will see how the concepts discussed in the previous section empower production workloads and how applications are deployed onto Kubernetes.

ReplicaSets

As we discussed, ReplicaSets enable the self-healing capability of Kubernetes at the infrastructure level by maintaining a stable number of Pods. In the event of a failure at the infrastructure level (i.e., the nodes that hold the Pods), the ReplicaSet will reschedule the Pods to a different, healthy node. A ReplicaSet includes the following:

Selector
ReplicaSets use Pod labels to find and list the Pods running in the cluster to create replicas in case of a failure.
Number of replicas to create
This specifies how many Pods should be created.
Template
This specifies the associated data of new Pods that a ReplicaSet should create to meet the desired number of Pods.

In actual production use cases, you will need a ReplicaSet to maintain a stable set of running Pods by introducing redundancy. But you don’t have to deal with the ReplicaSet directly, since it’s an abstraction layer. Instead, you should use Deployments, which offer a much better way to deploy and manage Pods.

A ReplicaSet looks like this:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-cnf-replicaset
  labels:
    app: cnfbook
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cnfbook
  template:
    metadata:
      labels:
        app: cnfbook
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

To create the ReplicaSet from the preceding configuration, save the manifest as nginx_Replicaset.yaml and apply it. This will create a ReplicaSet and three Pods, shown as follows:

$ kubectl apply -f nginx_Replicaset.yaml
replicaset.apps/myapp-cnf-replicaset created

$ kubectl get rs
NAME                   DESIRED   CURRENT   READY   AGE
myapp-cnf-replicaset   3         3         3       6s

$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
myapp-cnf-replicaset-drwp9   1/1     Running   0          11s
myapp-cnf-replicaset-pgwj8   1/1     Running   0          11s
myapp-cnf-replicaset-wqkll   1/1     Running   0          11s

Deployments

In production environments, the one constant is change. You will keep updating and changing the applications in your production environment at a very rapid pace, since the most common task in production is rolling out new features of your application. Kubernetes offers the Deployment object as the standard way of doing rolling updates, and it provides a seamless experience to the cluster administrator as well as end users. This means you can push updates to your applications without taking them down, since a Deployment ensures that only a certain number of Pods are down while they are being updated. These are known as zero-downtime deployments. By default, Deployments ensure that at least 75% of the desired number of Pods are up. Deployments are the reliable, safe, and recommended way to roll out new versions of applications with zero downtime in Kubernetes.

You can create a Deployment as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

To apply the preceding configuration, save it in a file named nginx_Deployment.yaml and then use kubectl apply as follows:

$  kubectl apply -f nginx_Deployment.yaml
deployment.apps/nginx-deployment created

Interestingly, the Deployment handles ReplicaSets as well, since we declared that we need three replicas in the Deployment. We can check this as follows:

$  kubectl get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   3/3     3            3           18s
$  kubectl get rs
NAME                         DESIRED   CURRENT   READY   AGE
nginx-deployment-d46f5678b   3         3         3       29s
$  kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-d46f5678b-dpc7p   1/1     Running   0          36s
nginx-deployment-d46f5678b-kdjxv   1/1     Running   0          36s
nginx-deployment-d46f5678b-kj8zz   1/1     Running   0          36s

So, from the manifest file, we set three replicas in the .spec.replicas field, and for the Deployment object to find the Pods being managed by nginx-deployment, we make use of the spec.selector field. The Pods are labeled via the metadata.labels entry in the template field.

Now let’s roll out a new version of Nginx. Say we want to pin the version of Nginx to 1.14.2. We just need to edit the Deployment, changing the container image to nginx:1.14.2 and saving the manifest, as follows:

$  kubectl edit deployment.v1.apps/nginx-deployment
deployment.apps/nginx-deployment edited

This will update the Deployment object, and you can check it as follows:

$  kubectl rollout status deployment.v1.apps/nginx-deployment
Waiting for deployment "nginx-deployment" rollout to finish: 1 out of 3 new replicas have 
been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 1 out of 3 new replicas have 
been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have 
been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have 
been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have 
been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 old replicas are pending 
termination...
Waiting for deployment "nginx-deployment" rollout to finish: 1 old replicas are pending 
termination...
Waiting for deployment "nginx-deployment" rollout to finish: 1 old replicas are pending 
termination...
deployment "nginx-deployment" successfully rolled out

The Deployment object ensures that a certain number of Pods are always available and serving while the older Pods are updated. As we already mentioned, by default no more than 25% of the desired Pods are unavailable while an update is being performed, and a max surge of 25% ensures that only a limited number of extra Pods are created over the desired count. So, from the rollout status, you can clearly see that at least two Pods are available at all times while rolling out a change. You can get the details for your deployment again by using kubectl describe deployment.
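These update parameters can also be set explicitly in the Deployment spec. The following snippet simply spells out the defaults; you could tighten or relax them per Deployment:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%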

Note

We directly issued a deployment change using kubectl edit, but the preferred approach is to always update the actual manifest file and then run kubectl apply. This also helps you keep your deployment manifests in version control. You can also use the --record flag or the kubectl set command to perform the update.
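As a sketch of that imperative alternative, the same nginx version pin could be applied (and recorded in the rollout history) with kubectl set:

$  kubectl set image deployment/nginx-deployment nginx=nginx:1.14.2 --record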

Horizontal Pod Autoscaler

Kubernetes supports dynamic scaling through the Horizontal Pod Autoscaler (HPA), where Pods are scaled horizontally; that is, the number of Pods is increased or decreased dynamically based on observed metrics, such as CPU utilization. The HPA works via a control loop which, at an interval of 15 seconds by default, checks the observed metrics against the resource utilization you specified (see Figure 4-4).

Figure 4-4. Horizontal Pod Autoscaler at work

It is important to note that to use the HPA, we need metrics-server, which collects metrics from kubelets and exposes them in the Kubernetes API server through the metrics API for use by the HPA. We first create an autoscaler using kubectl autoscale for our deployment, as follows:

$  ~ kubectl autoscale deployment nginx-deployment --cpu-percent=50 --min=3 --max=10
horizontalpodautoscaler.autoscaling/nginx-deployment autoscaled

The preceding kubectl command will create an HPA that will ensure that no fewer than three and no more than 10 Pods are used in our nginx-deployment. The HPA will increase or decrease the number of replicas to maintain an average CPU utilization across all Pods of no more than 50%.

You can view your HPA using the following:

$  ~ kubectl get hpa
NAME               REFERENCE                    TARGETS  MINPODS  MAXPODS  REPLICAS  AGE
nginx-deployment   Deployment/nginx-deployment  0%/50%   3        10       3         6m59s

Instead of using the imperative command, you can also define the scaling behavior declaratively with an HPA manifest YAML file that is tied to the Deployment.
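A minimal sketch of such a manifest, equivalent to the kubectl autoscale command above (using the stable autoscaling/v1 API), looks like this:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50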

Service

As we mentioned earlier, the Kubernetes environment is a very dynamic system where Pods are created, destroyed, and moved at a varying pace. This dynamism also opens the door to a well-known problem: finding the replica Pods where an application resides, since multiple Pods are running for a deployment, and Pods also need a way to find other Pods in order to communicate. Kubernetes offers the Service object as an abstraction for a single point of entry to a group of Pods. The Service object has an IP address, a DNS name, and a port that never change as long as the object exists. Formally known as service discovery, this feature basically helps Pods and services reach other services in Kubernetes without dealing with the underlying complexity. We will discuss the cloud native service discovery approach in more detail in Chapter 6.

To use a Service, you can write a service manifest in YAML. Suppose you have a simple “hello world” application already running as a Deployment. The default technique to expose this application is to specify the service type as ClusterIP. Such a Service is exposed on the cluster’s internal IP and is only reachable from within the cluster, as follows:

---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-service
spec:
  type: ClusterIP
  selector:
    app: hello-world
  ports:
  - port: 8080
    targetPort: 8080

The port field represents the port where the Service will be available, and the targetPort is the actual container port to which traffic will be forwarded. In this case, we have exposed port 8080 of the hello-world app, forwarding to target port 8080 on the Pod IPs:

$  kubectl get svc
NAME                  TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
hello-world-service   ClusterIP   10.32.0.40   <none>        8080/TCP       6s

The IP address shown is the cluster IP and not an actual machine IP address. If you SSH into a worker node, you can check whether the service is exposed on port 8080 by simply doing a curl as follows:

ubuntu@worker0:~$ curl http://10.32.0.40:8080
Hello World

If you try to reach this IP address from outside the cluster (i.e., any other node apart from workers), you won’t be able to connect to it. So, you can use NodePort, which exposes a service on each node’s IP at a defined port as follows:

---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-node-service
spec:
  type: NodePort
  selector:
    app: hello-world
  ports:
  - port: 8080
    targetPort: 8080
    nodePort: 30767

In the service manifest, we have mapped Pod port 8080 to NodePort (i.e., physical instance port) 30767. In this way, you can expose the IP directly or place a load balancer of your choice. If you now do a get svc, you can see the mapping of ports as follows:

$  kubectl get svc
NAME                       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
hello-world-node-service   NodePort    10.32.0.208   <none>        8080:30767/TCP   8s
hello-world-service        ClusterIP   10.32.0.40    <none>        8080/TCP         41m

Now we can access the service on the node at port 30767:

ubuntu@controller0:~$ curl http://10.240.0.21:30767
Hello World

The IP in the curl command is the worker node’s physical IP (not the cluster IP) address, and the port that is being exposed is 30767. You can even directly hit the public IP of the node for port 30767. Figure 4-5 showcases how the cluster IP, node port, and load balancer relate to one another.

Figure 4-5. Cluster IP, node port, and load balancer in a Kubernetes node

Other types of Services include LoadBalancer and ExternalName. LoadBalancer exposes the Service externally using a cloud provider’s load balancer; the NodePort and ClusterIP Services to which the external load balancer routes are created automatically. ExternalName maps a Service to a DNS name such as my.redisdb.internal.com. In Figure 4-6, you can see how the different service types relate to one another.

Figure 4-6. External service and LoadBalancer in a cloud environment
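For illustration, here is roughly what the two service types might look like for the same hello-world app; the Service names are placeholders, and the external DNS name follows the example in the text:

---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-lb-service
spec:
  type: LoadBalancer
  selector:
    app: hello-world
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: redisdb
spec:
  type: ExternalName
  externalName: my.redisdb.internal.com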

Ingress

The Service object helps expose the application both inside and outside the cluster, but in production systems we cannot afford to keep opening new and unique ports for all the services we deploy using NodePort, nor can we create a new load balancer every time we choose the service type to be LoadBalancer. At times we need to deploy an HTTP-based service and also perform SSL offloading, and the Service object doesn’t really help us in those instances. In Kubernetes, HTTP load balancing (or, formally, Layer 7 load balancing) is performed by the Ingress object.

To work with Ingress, we first need to configure an ingress controller.6 We will configure an Azure Kubernetes Service (AKS) Application Gateway Ingress Controller later to understand this better and see how it behaves, but in general, one of the easiest ways to understand Ingress is by looking at the following manifest:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-wildcard-host
spec:
  rules:
  - host: "foo.bar.com"
    http:
      paths:
      - pathType: Prefix
        path: "/bar"
        backend:
          service:
            name: hello-world-node-service
            port:
              number: 8080
  - host: "*.foo.com"
    http:
      paths:
      - pathType: Prefix
        path: "/foo"
        backend:
          service:
            name: service2
            port:
              number: 80

In the ingress manifest, we have created two rules and mapped the host foo.bar.com with the path /bar, which routes to our previous service, hello-world-node-service.

Similarly, we can have multiple paths defined for a domain and route them to other services. There are various ways to configure the ingress according to your needs, which might span from a single domain routing to multiple services to multiple domains routing to multiple services (see Figure 4-7).

Figure 4-7. Ingress with multiple paths for a domain

Lastly, you can specify TLS support by creating a Secret object and then using that secret in your ingress spec as follows:

apiVersion: v1
kind: Secret
metadata:
  name: my-tls-secret
  namespace: default
data:
  tls.crt: base64 encoded cert
  tls.key: base64 encoded key
type: kubernetes.io/tls

You can secure the ingress by specifying just the Base64-encoded TLS certificate and key. When you reference this secret in your ingress manifest, it will appear as follows:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tls-example-ingress
spec:
  tls:
  - hosts:
      - https.mywebsite.example.com
    secretName: my-tls-secret
  rules:
  - host: https.mywebsite.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: service1
            port:
              number: 80

Ingress controllers deal with many features and complexities, and the Azure Gateway Ingress Controller handles a lot of them; for example, leveraging Azure’s native Layer 7 application gateway load balancer to expose your services to the internet. We will discuss this in more detail when we introduce AKS in Chapter 5.

DaemonSet

DaemonSets, as we discussed earlier, are typically used to run an agent across a number of nodes in the Kubernetes cluster. The agent runs inside a container abstracted by a Pod. Most of the time, SREs and DevOps engineers prefer to run a log agent or a monitoring agent on each node to gain application telemetry and events. By default, a DaemonSet creates a copy of a Pod on every node, though this can be restricted using a node selector. There are a lot of similarities between ReplicaSets and DaemonSets, but the key distinction is that a DaemonSet runs a single copy of the agent Pod on each of your (selected) nodes.

One of the ways to run a logging container is by deploying a Pod in each node using a DaemonSet. Fluentd is an open source logging solution widely used to collect logs from systems. This example shows one of the ways to deploy Fluentd using a DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    central-log-k8s: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

In the preceding DaemonSet configuration, we created a DaemonSet that deploys the Fluentd container on each node. We have also added a toleration so that the Fluentd Pod can be scheduled even on the master (control plane) nodes, which are otherwise tainted with NoSchedule.

Note

Kubernetes offers scheduling features called taints and tolerations:

  • Taints in Kubernetes allow a node to repel a set of Pods (i.e., if you want certain nodes not to schedule some type of Pod).

  • Tolerations are applied to Pods, and allow (but do not require) the Pods to schedule onto nodes with matching taints.

Taints and tolerations work together to ensure that Pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this indicates that the node should not accept any Pods that do not tolerate the taints.
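As a quick sketch (the node name, taint key, and value here are arbitrary), you taint a node imperatively and then add a matching toleration to the Pod spec:

$  kubectl taint nodes worker1 dedicated=logging:NoSchedule
node/worker1 tainted

The Pod spec would then carry the matching toleration:

tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "logging"
  effect: "NoSchedule"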

You can check the Pods that were created automatically for each worker node as follows:

$  kubectl apply -f fluentd.yaml
daemonset.apps/fluentd-elasticsearch created

$ kubectl get ds --all-namespaces
NAMESPACE        NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   
kube-system      fluentd-elasticsearch   3         3         3       3            3           
NODE SELECTOR   AGE
<none>    5m12s

$  kubectl get pods --namespace=kube-system -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP            NODE      
fluentd-elasticsearch-5jxg7   1/1     Running   1          45m   10.200.1.53   worker1   
fluentd-elasticsearch-c5s4c   1/1     Running   1          45m   10.200.2.32   worker2   
fluentd-elasticsearch-r4pqz   1/1     Running   1          45m   10.200.0.43   worker0   
NOMINATED NODE   READINESS GATES
<none>           <none>
<none>           <none>
<none>           <none>

Jobs

Sometimes we need to run a small script until it can successfully terminate. Kubernetes allows you to do this with the Job object. The Job object creates and manages Pods that will run until successful completion, and unlike regular Pods, once the given task is completed, these Job-created Pods are not restarted. You can use a Job YAML to describe a simple Job as follows:

apiVersion: batch/v1
kind: Job
metadata:
  name: examplejob
spec:
  template:
    metadata:
      name: examplejob
    spec:
      containers:
      - name: examplejob
        image: busybox
        command: ["echo", "Cloud Native with Azure"]
      restartPolicy: Never

Here we created a Job to print a shell command. To create a Job, you can save the preceding YAML and apply it using kubectl apply. Once you apply the job manifest, Kubernetes will create the job and immediately run it. You can check the status of the job using kubectl describe as follows:

$  kubectl apply -f job.yaml
job.batch/examplejob created
$  kubectl get jobs
NAME         COMPLETIONS   DURATION   AGE
examplejob   1/1           3s         9s
$  kubectl describe job examplejob
Name:           examplejob
Namespace:      default
Selector:       controller-uid=f6887706-85ef-4752-8911-79cc7ab33886
Labels:         controller-uid=f6887706-85ef-4752-8911-79cc7ab33886
                job-name=examplejob
Annotations:    kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"batch/v1","kind":"Job","metadata": 
                  {"annotations":{},"name":"examplejob","namespace":"default"},
                  "spec":{"template":{"metadat...
Parallelism:    1
Completions:    1
Start Time:     Mon, 07 Sep 2020 01:08:53 +0530
Completed At:   Mon, 07 Sep 2020 01:08:56 +0530
Duration:       3s
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=f6887706-85ef-4752-8911-79cc7ab33886
           job-name=examplejob
  Containers:
   examplejob:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Command:
      echo
      Cloud Native with Azure
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  61s   job-controller  Created pod: examplejob-mcqzc
  Normal  Completed         58s   job-controller  Job completed

Summary

Kubernetes is a powerful platform that was built from a decade of experience gained by running containerized applications at scale at Google. Kubernetes was the seed project for the Cloud Native Computing Foundation and the first project to graduate under it, which streamlined the microservices ecosystem and drove broader support for and adoption of cloud native environments. In this chapter, we looked at the various components and concepts that allow Kubernetes to operate at scale. This chapter also sets the stage for upcoming chapters in which we will utilize the Kubernetes platform to serve production-grade cloud native applications.

Given the underlying complexity of managing a Kubernetes cluster, in Chapter 5 we will look at how to create and use such a cluster. We will also look at the Azure Kubernetes Service, and more.

1 Managing Kubernetes by Brendan Burns and Craig Tracey (O’Reilly, 2018); Kubernetes in Action by Marko Lukša (Manning, 2018).

2 Pods are the smallest deployable units of computing that you can create and manage in Kubernetes.

3 In Kubernetes, services are a way of exposing your Pods so that they can be discovered inside the Kubernetes cluster.

4 Fractional requests are allowed. For example, one CPU can be broken into two 0.5s. The expression 0.1 is equivalent to 100m.

5 Limit and request are measured in bytes. Memory can be expressed as a plain integer or a fixed-point number using the suffixes E, P, T, G, M, K, or their power-of-two equivalents Ei, Pi, Ti, Gi, Mi, Ki.

6 For the Ingress resource to work, the cluster must have an ingress controller running. Unlike other types of controllers that run as part of the kube-controller-manager binary, ingress controllers are not started automatically with a cluster. There are different types of ingress controllers; for example, Azure offers the AKS Application Gateway Ingress Controller for configuring the Azure Application Gateway.
