Chapter 4. Kubernetes: The Grand Orchestrator
As monolithic applications were broken down into microservices, containers became the de facto housing for these microservices. Microservices are a cloud native architectural approach in which a single application is composed of many smaller, loosely coupled, and independently deployable components or services. Containers ensure that software runs correctly when moved between different environments. Through containers, microservices work in unison with other microservices to form a fully functioning application.
While breaking monolithic applications into smaller services solves one problem, it creates bigger problems in terms of managing and maintaining an application without significant downtime, networking the various microservices, distributed storage, and so on. Containers help by decoupling applications into fast-moving, smaller codebases with a focus on feature development. However, although this decoupling is clean and easy at first since there are fewer containers to manage, as the number of microservices in an application increases it becomes nearly impossible to debug, update, or even deploy the containerized microservices safely into the stack without breaking something or causing downtime.
Containerizing applications was the first big step toward creating self-healable environments with zero downtime, but this practice needed to evolve further, especially in terms of software development and delivery in cloud native environments. This led to the development of schedulers and orchestrator engines like Mesos, Docker Swarm, Nomad, and Kubernetes. In this chapter and the next, we will focus on Kubernetes due to its maturity and wide industry adoption.
Google introduced Kubernetes to the world in 2014, after spending almost a decade perfecting and learning from its internal cluster management system, Borg. In simple terms, Kubernetes is an open source container orchestration system. The term container orchestration primarily refers to the full lifecycle of managing containers in dynamic environments like the cloud, where machines (servers) come and go on an as-needed basis. A container orchestrator automates and manages a variety of tasks, such as provisioning and deployment, scheduling, resource allocation, scaling, load balancing, and monitoring container health. In 2015, Google donated Kubernetes to the Cloud Native Computing Foundation (CNCF).
Kubernetes (aka K8s) is one of the most widely adopted pieces of a cloud native solution since it takes care of deployment, scalability, and maintenance of containerized applications. Kubernetes helps site reliability engineers (SREs) and DevOps engineers run their cloud workload with resiliency while taking care of scaling and failover of applications that span multiple containers across clusters.
Why Is Kubernetes Called K8s?
Kubernetes is derived from a Greek word that means helmsman or pilot of a ship. Kubernetes is also called K8s. This numeronym was derived by replacing the eight letters between the K and the s ("ubernete") with an 8.
Following are some of the key features that Kubernetes provides right out of the box:
- Self-healing
- One of the most prominent features of Kubernetes is the process of spinning up new containers when a container crashes.
- Service discovery
In a cloud native environment, the containers move from one host to another. The process of actually figuring out how to connect to a service/application that is running in a container is referred to as service discovery. Kubernetes exposes containers automatically using DNS or the containers' IP addresses.
- Load balancing
- To keep the deployed application in a stable state, Kubernetes load-balances and distributes the incoming traffic automatically.
- Automatic deployments
Kubernetes works on a declarative syntax, which means you don't have to worry about how to deploy applications. Rather, you tell Kubernetes what needs to be deployed and it takes care of it.
- Bin packing
- To make the best use of compute resources, Kubernetes automatically deploys the containers on the best possible host without wasting or impairing overall availability for other containers.
In this chapter, we will dive into the major components and the underlying concepts of Kubernetes. Though this chapter doesn't aim to teach Kubernetes in full, as there are already a wide array of books1 that cover the topic at length, we want to build a strong foundation. This hands-on approach will help you better understand the nitty-gritty of the overall environment. Let's start with a discussion of how the Kubernetes cluster works.
Kubernetes Components
The Kubernetes cluster contains two types of node components:
- Control plane
- This is the governing component of the Kubernetes cluster. It ensures that a couple of important services (e.g., scheduling, starting up new Pods2) are always running. The core purpose of the control plane node is to make sure the cluster is always in a healthy and correct state.
- Worker nodes
- These are the compute instances that run your workloads in the Kubernetes cluster and host all your containers.
Figure 4-1 depicts the high-level components of Kubernetes. The lines signify the connections between them; for example, the worker nodes accept connections from the load balancer, which distributes incoming traffic.
Now let's take a detailed look at each component.
Control Plane
The control plane is primarily responsible for making global decisions about the Kubernetes cluster, such as detecting cluster state, scheduling Pods on nodes, and managing the lifecycle of the Pods. The Kubernetes control plane has several components, which we describe in the following subsections.
kube-apiserver (API server)
The API server is the frontend of the Kubernetes control plane and the only component with direct access to the entire Kubernetes cluster. As you saw in Figure 4-1, the API server serves as the central point for all interactions between the worker nodes and the control plane nodes. The services running in a Kubernetes cluster use the API server to communicate with one another. You can run multiple instances of the API server because it's designed to scale horizontally.
Kube scheduler
The Kube scheduler is responsible for determining which worker node will run a Pod (the basic unit of work; we will explain Pods in more detail later in this chapter). The Kube scheduler talks to the API server and determines which worker nodes are available to run the scheduled Pod in the best possible way. The scheduler looks for newly created Pods that are yet to be assigned to a node, finds feasible nodes as potential candidates, and scores each of them based on different factors, such as node resource capacity and hardware requirements, to ensure that the correct scheduling decision is made. The node with the highest score is chosen to run the Pod. The scheduler also notifies the API server about this decision in a process referred to as binding.
Kube controller manager
Kubernetes has a core built-in feature that implements the self-healing capabilities in the cluster. This feature is called the Kube controller manager and it runs as a daemon. The controller manager executes a control loop called the reconciliation loop, which is a nonterminating loop that is responsible for the following:
- Determining whether a node has gone down, and if so, taking action. This is done by the Node controller.
- Maintaining the correct number of Pods. This is done by the Replication controller.
- Joining the endpoint objects (i.e., services and Pods). This is done by the Endpoint controller.
- Ensuring that default accounts and endpoints are created for new namespaces. This is done by the Service Account and Token controllers.
The reconciliation loop is the driving force behind the self-healing capability of Kubernetes. Kubernetes determines the state of the cluster and its objects by continuously running the following steps in a loop:
- Fetch the user-declared state (the desired state).
- Observe the state of the cluster.
- Compare the observed and desired states to find differences.
- Take action so that the observed state matches the desired state.
etcd
Kubernetes uses etcd as a data store. etcd is a key-value store that is responsible for persisting all the Kubernetes objects. It was originally created by the CoreOS team and is now managed by CNCF. It is usually spun up in a highly available setup, and the etcd nodes are hosted on separate instances.
Worker Nodes
A Kubernetes cluster contains a set of worker machines called worker nodes that run the containerized applications. The control plane manages the worker nodes and the Pods in the cluster. Some components run on all Kubernetes worker nodes; they are discussed in the following subsections.
Kubelet
Kubelet is the daemon agent that runs on every node to ensure that the containers in a Pod are always running and healthy. The kubelet reports to the API server about the currently available resources (CPU, memory, disk) on its worker node so that the API server can use the controller manager to observe the Pods' state. Since the kubelet is the agent that runs on the worker nodes, it also handles basic housekeeping tasks such as restarting containers when required and consistently conducting health checks.
Kube-proxy
Kube-proxy is the networking component that runs on each node. Kube-proxy watches all Kubernetes services3 in the cluster and ensures that when a request to a particular service is made, it gets routed to the particular virtual IP endpoint. Kube-proxy is responsible for implementing a kind of virtual IP for services.
Now that you've got the basics of Kubernetes components under your belt, let's dig a bit deeper and learn more about the Kubernetes API server.
Kubernetes API Server Objects
The API server is responsible for all the communication inside and outside a Kubernetes cluster, and it exposes a RESTful HTTP API. The API server, at a fundamental level, allows you to query and manipulate Kubernetes objects. In simple terms, Kubernetes objects are stateful entities that represent the overall state of your cluster (Figure 4-2). To start working with these objects, we need to understand the fundamentals of each.
Pods
In Kubernetes, Pods are the smallest basic atomic unit. A Pod is a group of one or more containers that get deployed on the worker nodes. Kubernetes is responsible for managing the containers running inside a Pod. Containers inside a Pod will always end up on the same worker node and are tightly coupled. Since the containers inside a Pod are co-located, they run in the same context (i.e., they share the network and storage). This shared context is a set of Linux namespaces, cgroups, and other mechanisms that maintain isolation (as we explained in Chapter 3). Each Pod also gets a unique IP address.
In typical scenarios, a single container is run inside a Pod, but in some instances, multiple containers need to work together in a Pod. This latter setup is usually referred to as a sidecar container. One of the most common examples of running a sidecar container is running a logging container for your application that will ship your logs to external storage, such as an ELK (Elasticsearch, Logstash, and Kibana) server, in case your application Pod crashes or a Pod is deleted. Pods are also smart in that if a process in a container dies, Kubernetes will instantly restart it based on the health checks defined at the application level.
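To make the sidecar pattern concrete, here is a minimal sketch of a two-container Pod manifest; the image names and the shared emptyDir volume are illustrative assumptions rather than examples from this chapter:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar     # hypothetical Pod name
spec:
  volumes:
    - name: app-logs                 # scratch volume shared by both containers
      emptyDir: {}
  containers:
    - name: app                      # main application container (assumed image)
      image: myapp:1.0
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: log-shipper              # sidecar that ships the logs to external storage
      image: fluent/fluentd:v1.16    # assumed log-shipping image
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true

Both containers share the Pod's network namespace and the app-logs volume, which is what lets the sidecar read and forward the application's log files.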
Another characteristic of Pods is that they allow horizontal scaling by replication implemented through ReplicaSets. This means that if you want your application to scale horizontally, you should create more Pods by using ReplicaSets.
Pods are also ephemeral in nature, which means that if a Pod gets killed, it will be moved and restarted on a different host. This is also accomplished by using ReplicaSets.
ReplicaSets
Reliability is a chief characteristic of Kubernetes, and since no one would be running a single instance of a Pod, redundancy becomes important. A ReplicaSet is a Kubernetes object that ensures that a stable set of replica Pods is running to maintain a self-healing cluster. All of this is achieved by the reconciliation loop, which keeps running in the background to observe the overall state of the Kubernetes cluster. ReplicaSets use the reconciliation loop to ensure that if any of the Pods crash or get restarted, a new Pod will be started in order to maintain the desired state of replication. In general, you should not deal directly with ReplicaSets, but rather use the Deployment object, which ensures zero-downtime updates to your application and has a declarative approach toward managing and operating Kubernetes.
Deployments
Kubernetes is primarily a declarative syntax-focused orchestrator. This means that in order to roll out new features, you tell Kubernetes what you need to do and it's up to Kubernetes to figure out how to perform that operation in a safe manner. One of the objects that Kubernetes offers in order to make the release of new versions of applications smoother is Deployment. If you manually update Pods, you will need to restart them, which will cause downtime. While a ReplicaSet knows how to maintain the desired number of Pods, it won't do a zero-downtime upgrade. This is where the Deployment object comes into the picture: it helps roll out changes to Pods with zero downtime by keeping a predefined number of Pods active at all times before a new, updated Pod is rolled out.
Services
To expose an application that is running inside a Pod, Kubernetes offers an object called Service. Since Kubernetes is a very dynamic system, it is necessary to ensure that applications are talking to the correct backends. Pods are short-lived processes in the Kubernetes world, as they are frequently created and destroyed. Each Pod is coupled with a unique IP address, which means that if you rely on just the IP address of a Pod, you will most likely end up with a broken service when that Pod dies, as it will get a different IP address after restart, even though you have ReplicaSets running. The Service object offers an abstraction by defining a logical set of Pods and a policy by which to access them. Each Service gets a stable IP address and a DNS name that can be used to access the Pods. You can declaratively define the services that front your Pods and use a label selector to access them.
Namespaces
Since multiple teams and projects are deployed in a production-grade environment, it becomes necessary to organize the Kubernetes objects. In a simple sense, namespaces are virtual clusters separated by logical partitioning; that is, you can group your resources, such as deployments, Pods, and so on, based on logical partitions. Some people like to think of namespaces as directories to separate names. Every object in your cluster has a name that is unique to that particular type of resource, and similarly every object has a UID that is unique across the whole cluster. Namespaces also allow you to divide the cluster resources among multiple users by setting resource quotas.
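As a quick illustration (the namespace name and quota values here are hypothetical), a namespace and a resource quota that constrains it can be declared as follows:

apiVersion: v1
kind: Namespace
metadata:
  name: team-a                 # illustrative namespace name
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    pods: "20"                 # at most 20 Pods in this namespace
    requests.cpu: "4"          # total CPU requests capped at 4 cores
    requests.memory: 8Gi       # total memory requests capped at 8Gi

You can also create a namespace imperatively with kubectl create namespace team-a.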
Labels and Selectors
As you start using Kubernetes and creating objects, you'll realize the need to identify or mark your Kubernetes resources in order to group them into logical entities. Kubernetes offers labels to identify metadata for objects, which easily allows you to group and operate on the resources. Labels are key-value pairs that can be attached directly to objects like Pods, namespaces, DaemonSets, and so on. You can add labels at any time and modify them as you like. To find or identify your Kubernetes resources, you can query the labels using label selectors. For example, a label for a type of application tier can be:
"tier" : "frontend", "tier" : "backend", "tier" : "midtier"
Annotations
Annotations are also key-value pairs, but unlike labels, which are used to identify objects, annotations are used to hold nonidentifying information about the object itself. For example, build, release, or image information such as timestamps, release IDs, the Git branch, PR numbers, image hashes, and the registry address can be recorded in an annotation.
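A hedged sketch of what such annotations might look like in a Pod's metadata follows; the keys and values are illustrative, not a required schema:

apiVersion: v1
kind: Pod
metadata:
  name: annotated-pod
  annotations:
    build/git-branch: "release-1.4"            # illustrative build metadata
    build/release-id: "20200907.1"
    image/registry: "myregistry.azurecr.io"    # assumed registry address
spec:
  containers:
    - name: app
      image: nginx

Unlike labels, these keys cannot be used in selectors; they simply ride along with the object for tools and humans to read.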
Ingress Controller
For your services to receive traffic from the internet, you need to expose HTTP and HTTPS endpoints from the outside to Kubernetes services running on Pods. An ingress allows you to expose your services running inside the cluster to the outside world by offering load balancing with Secure Sockets Layer/Transport Layer Security (SSL/TLS) terminations using name-based virtual hosting. To support an ingress, you should first choose an ingress controller, which is similar to a reverse proxy, to accept incoming connections for HTTP and HTTPS.
StatefulSets
To manage and scale your stateful workloads on Kubernetes, you need to ensure that the Pods are stable (i.e., they have stable network identities and stable storage). StatefulSets ensure the ordering of the Pods and maintain their uniqueness (unlike ReplicaSets). A StatefulSet is a controller that helps you deploy groups of Pods that remain resilient to restarts and reschedules. Each Pod in a StatefulSet follows a predictable naming convention, with an ordinal value that starts at 0, and has a stable network ID associated with it (unlike in a ReplicaSet, where Pod names are random in nature).
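A minimal StatefulSet sketch is shown below; the headless Service name, image, and storage size are assumptions for illustration. It would produce Pods named web-0, web-1, and web-2, each with its own PersistentVolumeClaim:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "web-headless"      # headless Service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
  volumeClaimTemplates:            # gives each ordinal Pod its own stable volume
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi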
DaemonSets
In regular environments, we run a number of daemon services and agents on the host, including logging agents and monitoring agents. Kubernetes allows you to install such agents by running a copy of a Pod across a set of nodes in a cluster using DaemonSets. Just like ReplicaSets, DaemonSets are long-running processes that ensure that the desired state and observed state remain the same. Deleting a DaemonSet also deletes the Pods that it previously created.
Jobs
Jobs in the Kubernetes world are short-lived entities, such as small tasks like running a standalone script. Jobs eventually create Pods. Jobs run until they successfully terminate, which is the major difference between a Pod controlled by a Job and a regular Pod, which will keep getting restarted and rescheduled if terminated. If a Job Pod fails before completion, the controller will create a new Pod based on the template.
Note
This chapter is a crash course in Kubernetes and just scratches the surface of container orchestration. Other resources are available that can help you learn more about Kubernetes. Some of the ones we recommend are:
- Managing Kubernetes by Brendan Burns and Craig Tracey (O'Reilly, 2018)
- Kubernetes: Up and Running, 2nd Edition by Brendan Burns, Joe Beda, and Kelsey Hightower (O'Reilly, 2019)
- "Introduction to Kubernetes", a free course from LinuxFoundationX
- Cloud Native DevOps with Kubernetes, 2nd Edition by John Arundel and Justin Domingus (O'Reilly, 2022)
Now that we have covered some basic terminology in the Kubernetes world, let's take a look at the operational details for managing the cluster.
Observe, Operate, and Manage Kubernetes Clusters with kubectl
One of the more common ways to interact with a container orchestrator is by using either a command-line tool or a graphical tool. In Kubernetes you can interact with the cluster in both ways, but the preferred way is the command line. Kubernetes offers kubectl as the CLI. kubectl is widely used to administer the cluster. You can consider kubectl to be a Swiss Army knife of various Kubernetes functions that enable you to deploy and manage applications. If you are an administrator for your Kubernetes cluster, you would be using the kubectl command extensively to manage the cluster. kubectl offers a variety of commands, including:
- Resource configuration commands, which are declarative in nature
- Debugging commands for getting information about workloads
- Debugging commands for manipulating and interacting with Pods
- General cluster management commands
In this section, we will explore Kubernetes in depth by focusing on basic cluster commands for managing Kubernetes clusters, Pods, and other objects with the help of kubectl.
General Cluster Information and Commands
The first step to interacting with the Kubernetes cluster is to learn how to gain insights on the cluster, infrastructure, and Kubernetes components that you'll be working with. To see the worker nodes that run the workloads in your cluster, issue the following kubectl command:
$ ~ kubectl get nodes
NAME      STATUS   ROLES    AGE   VERSION
worker0   Ready    <none>   11d   v1.17.3
worker1   Ready    <none>   11d   v1.17.3
worker2   Ready    <none>   11d   v1.17.3
This will list your node resources and their status, along with version information. You can gain a bit more information by using the -o wide flag in the get nodes command as follows:
$ ~ kubectl get nodes -o wide
NAME      STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
worker0   Ready    <none>   11d   v1.17.3   10.240.0.20   <none>        Ubuntu 16.04.7 LTS   4.15.0-1092-azure   containerd://1.3.2
worker1   Ready    <none>   11d   v1.17.3   10.240.0.21   <none>        Ubuntu 16.04.7 LTS   4.15.0-1092-azure   containerd://1.3.2
worker2   Ready    <none>   11d   v1.17.3   10.240.0.22   <none>        Ubuntu 16.04.7 LTS   4.15.0-1092-azure   containerd://1.3.2
To get even more details specific to a resource or worker, you can use the describe command as follows:
$ ~ kubectl describe nodes worker0
The kubectl describe command is a very useful debugging command. You can use it to gain in-depth information about Pods and other resources.
The get command can be used to gain more information about Pods, services, replication controllers, and so on. For example, to get information on all Pods, you can run:
$ ~ kubectl get pods
NAME                       READY   STATUS              RESTARTS   AGE
busybox-56d8458597-xpjcg   0/1     ContainerCreating   0          1m
To get a list of all the namespaces in your cluster, you can use get as follows:
$ ~ kubectl get namespace
NAME                   STATUS   AGE
default                Active   12d
kube-node-lease        Active   12d
kube-public            Active   12d
kube-system            Active   12d
kubernetes-dashboard   Active   46h
By default, kubectl interacts with the default namespace. To use a different namespace, you can pass the --namespace flag to reference objects in a namespace:
$ ~ kubectl get pods --namespace=default
NAME                       READY   STATUS    RESTARTS   AGE
busybox-56d8458597-xpjcg   1/1     Running   0          47h
$ ~ kubectl get pods --namespace=kube-public
No resources found in kube-public namespace.
$ ~ kubectl get pods --namespace=kubernetes-dashboard
NAME                                         READY   STATUS    RESTARTS   AGE
dashboard-metrics-scraper-779f5454cb-gq82q   1/1     Running   0          2m
kubernetes-dashboard-857bb4c778-qxsnj        1/1     Running   0          46h
To view the overall cluster details, you can make use of cluster-info as follows:
$ ~ kubectl cluster-info
Kubernetes master is running at https://40.70.3.6:6443
CoreDNS is running at https://40.70.3.6:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
The cluster-info command gets you the details of the API load balancer where the control plane is sitting, along with other components.
In Kubernetes, to maintain a connection with a specific cluster you use a context. A context helps group access parameters under one name. Each context contains a Kubernetes cluster, a user, and a namespace. The current context is the cluster that is currently the default for kubectl, and all the commands being issued by kubectl run against this cluster. You can view your current context as follows:
$ ~ kubectl config current-context
cloud-native-azure
To change the default namespace, you can use a context that gets registered to your environment's kubectl kubeconfig file. The kubeconfig file is the actual file that tells kubectl how to find your Kubernetes cluster and then authenticate based on the secrets that have been configured. The file is usually stored in your home directory under .kube/. To create and use a new context with a different namespace as its default, you can do the following:
$ ~ kubectl config set-context test --namespace=mystuff
Context "test" created.
$ ~ kubectl config use-context test
Switched to context "test"
Make sure the context (the test Kubernetes cluster) actually exists, or else you won't be able to use it.
Labels, as we mentioned before, are used to organize your objects. For example, if you want to label the busybox Pod with a value called production, you can do it as follows, where environment is the label name:
$ ~ kubectl label pods busybox-56d8458597-xpjcg environment=production
pod/busybox-56d8458597-xpjcg labeled
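Now that the Pod carries the environment label, you can filter Pods using label selectors, either equality-based or set-based:

$ ~ kubectl get pods -l environment=production
$ ~ kubectl get pods -l 'environment in (production, staging)'

The second form assumes a hypothetical staging value purely to illustrate a set-based selector.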
At times, you might want to find out what is wrong with your Pods and try to debug an issue. To see a log for a Pod, you can run the logs command on a Pod name:
$ ~ kubectl get pods
NAME                       READY   STATUS              RESTARTS   AGE
busybox-56d8458597-xpjcg   0/1     ContainerCreating   0          2d
$ ~ kubectl logs busybox-56d8458597-xpjcg
Error from server (BadRequest): container "busybox" in pod "busybox-56d8458597-xpjcg" is waiting to start: ContainerCreating
Sometimes you can have multiple containers running inside a Pod. To choose a specific container, you can pass the -c flag.
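For instance, assuming the Pod ran a sidecar container named log-shipper (a hypothetical name), you could fetch just that container's logs:

$ ~ kubectl logs busybox-56d8458597-xpjcg -c log-shipper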
Managing Pods
As we discussed earlier, Pods are the smallest deployable artifacts in Kubernetes. As an end user or administrator, you deal directly with Pods and not with containers. The containers are handled by Kubernetes internally, and this logic is abstracted. It is also important to remember that all the containers inside a Pod are placed on the same node. Pods also have a defined lifecycle whose states move from Pending to Running to Succeeded or Failed.
One of the ways you can create a Pod is by using the kubectl run command as follows:
$ kubectl run <name of pod> --image=<name of the image from registry>
The kubectl run command pulls a public image from a container repository and creates a Pod. For example, you can run a hazelcast image as follows, and expose the container's port too:
$ ~ kubectl run hazelcast --image=hazelcast/hazelcast --port=5701
deployment.apps/hazelcast created
Note
In production environments, you should never run or create standalone Pods this way, because such Pods are not managed by a controller and will not get restarted or rescheduled in case of a failure. You should use Deployments as the preferred way of operating Pods.
The kubectl run command is rich in feature sets, and you can control many Pod behaviors. For example, if you wish to run a Pod in the foreground (i.e., with an interactive terminal inside the Pod) and you don't wish to restart it if it crashes, you can do the following:
$ ~ kubectl run -i -t busybox --image=busybox --restart=Never
If you don't see a command prompt, try pressing enter.
/ #
This command will directly log you in to the container. And since you ran the Pod in interactive mode, you can watch the status of busybox change from Running to Completed once you exit the shell, as in the following:
$ ~ kubectl get pods
NAME      READY   STATUS    RESTARTS   AGE
busybox   1/1     Running   0          52s
$ ~ kubectl get pods
NAME      READY   STATUS      RESTARTS   AGE
busybox   0/1     Completed   0          61s
Another way to create Pods in Kubernetes is by using the declarative syntax in Pod manifests. The Pod manifests, which should be treated with the same importance as your application code, can be written using either YAML or JSON. You can create a Pod manifest for running an Nginx Pod as follows:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      ports:
        - containerPort: 80
          name: http
The Pod manifest contains information such as kind, spec, and other fields that are sent to the Kubernetes API server to act on. You can save the Pod manifest with a YAML extension and use apply as follows:
$ kubectl apply -f nginx_pod.yaml
pod/nginx created
$ kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          7s
When you run kubectl apply, the Pod manifest is sent to the Kubernetes API server, which instantly schedules the Pod to run on a healthy node in the cluster. The Pod is monitored by the kubelet daemon process, and if the Pod crashes, it's rescheduled to run on a different healthy node.
Let's now move on and take a look at how you can implement health checks on your services in Kubernetes.
Health checks
Kubernetes offers three types of health checks, called probes, for ensuring the application is actually alive and well: liveness probes, readiness probes, and startup probes.
Liveness probe
This probe is responsible for ensuring that the application is actually healthy and fully functioning. After deployment, your application may take a few seconds before it is ready, so you can configure this probe to check a certain endpoint in your application. For example, in the following Pod manifest, a liveness probe has been used to perform an httpGet operation against the / path on port 80. The initialDelaySeconds is set to 2, which means the / endpoint will not be probed until this period has elapsed. Additionally, we have set the timeout to 1 second and the failure threshold to 3 consecutive probe failures. The periodSeconds defines how frequently Kubernetes will call the probe; in this case, every 15 seconds.
apiVersion: v1
kind: Pod
metadata:
  name: mytest-pod
spec:
  containers:
    - image: test_image
      imagePullPolicy: IfNotPresent
      name: mytest-container
      command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
      ports:
        - name: liveness-port
          containerPort: 80
          hostPort: 8080
      livenessProbe:
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 2
        timeoutSeconds: 1
        periodSeconds: 15
        failureThreshold: 3
Readiness probe
The readiness probe's responsibility is to identify when a container is ready to serve user requests. A readiness probe helps Kubernetes avoid adding an unready Pod's endpoint to a load balancer too early. A readiness probe can be configured alongside a liveness probe block in the Pod manifest as follows:
containers:
  - name: test_image
    image: test_image
    command: ["/bin/sh"]
    args: ['-c', 'echo Container 1 is Running ; sleep 3600']
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
Startup probe
Sometimes an application requires some additional startup time on its first initialization. In such cases, you can set up a startup probe with an HTTP or TCP check whose failureThreshold * periodSeconds is long enough to cover the worst-case startup time:
startupProbe:
  httpGet:
    path: /healthapi
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10
Resource limits
When you deal with Pods, you can specify the resources your application will need. Some of the most basic resource requirements for a Pod to run are CPU and memory. Although there are more types of resources that Kubernetes can handle, we will keep it simple here and discuss only CPU and memory.
You can declare two parameters in the Pod manifest: requests and limits. In the requests block you tell Kubernetes the minimum resources your application needs to operate, and in the limits block you tell Kubernetes the maximum threshold. If your application breaches the threshold in the limits block, it will be terminated or restarted, and the Pod may be evicted. For example, in the following Pod manifest we are placing a maximum CPU limit of 500m4 and a maximum memory limit of 206Mi,5 which means that if these values are crossed, the Pod can be evicted and rescheduled on some other node:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      ports:
        - containerPort: 80
          name: http
      resources:
        requests:
          cpu: "100m"
          memory: "108Mi"
        limits:
          cpu: "500m"
          memory: "206Mi"
Volumes
Some applications require data to be stored permanently, but because Pods are short-lived entities that are frequently restarted or killed on the fly, all data associated with a Pod can be destroyed as well. Volumes solve this problem by providing an abstraction layer over the storage disks in Azure. A volume is a way to store, retrieve, and persist data across Pods throughout the application lifecycle. If your application is stateful, you need to use a volume to persist your data. Azure provides Azure Disk and Azure Files to create the data volumes that provide this functionality.
Persistent Volume Claim (PVC)
The PVC serves as an abstraction layer between the Pod and storage. In Kubernetes, the Pod mounts volumes with the help of PVCs, and the PVCs talk to the underlying resources. One thing to note is that a PVC lets you consume the abstracted underlying storage resource; that is, it lets you claim a piece of preprovisioned storage. The PVC defines the disk size and disk type and then mounts the real storage to the Pod; this binding process can be static, as with a persistent volume (PV), or dynamic, as with a storage class. For example, in the following manifest we are creating a PersistentVolumeClaim with a storage request of 1Gi:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: persistent-volume-claim-app1
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: azurefilestorage
Persistent volume (static)
A cluster administrator, usually the SRE or DevOps team, can create a predefined number of persistent volumes manually, which can then be used by the cluster users as they require. The persistent volume is the static provisioning method. For example, in the following manifest we are creating a PersistentVolume with a storage capacity of 2Gi:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-persistent-volume-app1
  labels:
    storage: azurefile
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  storageClassName: azurefilestorage
  azureFile:
    secretName: static-persistence-secret
    shareName: user-app1
    readOnly: false
Storage class (dynamic)
Storage classes are dynamically provisioned volumes for the PVC (i.e., they allow storage volumes to be created on demand). Storage classes basically provide the cluster administrators a way to describe the classes of storage that can be offered. Each storage class has a provisioner that determines which volume plug-in is used for provisioning the persistent volumes.
In Azure, two kinds of provisioners determine the kind of storage that will be used: AzureFile and AzureDisk.
AzureFile can be used with ReadWriteMany access mode:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile
provisioner: kubernetes.io/azure-file
parameters:
  skuName: Standard_LRS
  location: eastus
  storageAccount: azure_storage_account_name
AzureDisk can only be used with ReadWriteOnce access mode:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Shared
Figure 4-3 showcases the logical relationship between the various storage objects that Kubernetes offers.
Lastly, you can delete a Pod by using kubectl delete pod as follows:
$ ~ kubectl delete pod nginx
pod "nginx" deleted
You should make sure that when you delete a Pod, it's not actually controlled by a Deployment, because if it is, it will reappear. For example:
$ ~ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
hazelcast-84bb5bb976-vrk97   1/1     Running   2          2d7h
nginx-76df748b9-rdbqq        1/1     Running   2          2d7h
$ ~ kubectl delete pod hazelcast-84bb5bb976-vrk97
pod "hazelcast-84bb5bb976-vrk97" deleted
$ ~ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
hazelcast-84bb5bb976-rpcfh   1/1     Running   0          13s
nginx-76df748b9-rdbqq        1/1     Running   2          2d7h
The reason a new Pod was spun up is that Kubernetes observed that the desired and observed states no longer matched, and the reconciliation loop kicked in to balance the Pods. This happens when the Pod was actually deployed via a Deployment and therefore has a ReplicaSet associated with it. So, in order to delete the Pod for good, you need to delete the Deployment, which in turn automatically deletes its Pods. This is shown as follows:
$ ~ kubectl get deployments --all-namespaces
NAMESPACE     NAME        READY   UP-TO-DATE   AVAILABLE   AGE
default       hazelcast   1/1     1            1           2d21h
default       nginx       1/1     1            1           2d21h
kube-system   coredns     2/2     2            2           3d7h
$ ~ kubectl delete -n default deployment hazelcast
deployment.apps "hazelcast" deleted
$ ~ kubectl get pods
NAME                         READY   STATUS        RESTARTS   AGE
hazelcast-84bb5bb976-rpcfh   0/1     Terminating   0          21m
nginx-76df748b9-rdbqq        1/1     Running       2          2d7h
Kubernetes in Production
Now that we have covered the basics of Pods, Kubernetes concepts, and how the Kubernetes cluster works, let's take a look at how to actually tie together the Pods and become production ready in a Kubernetes cluster. In this section, we will see how the concepts discussed in the previous section empower production workloads and how applications are deployed onto Kubernetes.
ReplicaSets
As we discussed, ReplicaSets enable the self-healing capability of Kubernetes at the infrastructure level by maintaining a stable number of Pods. In the event of a failure at the infrastructure level (i.e., the nodes that hold the Pods), the ReplicaSet will reschedule the Pods to a different, healthy node. A ReplicaSet includes the following:
- Selector
- ReplicaSets use Pod labels to find and list the Pods running in the cluster to create replicas in case of a failure.
- Number of replicas to create
- This specifies how many Pods should be created.
- Template
- This specifies the associated data of new Pods that a ReplicaSet should create to meet the desired number of Pods.
In actual production use cases, you will need a ReplicaSet to maintain a stable set of running Pods by introducing redundancy. But you don't have to deal with the ReplicaSet directly, since it's an abstraction layer. Instead, you should use Deployments, which offer a much better way to deploy and manage Pods.
A ReplicaSet looks like this:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-cnf-replicaset
  labels:
    app: cnfbook
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cnfbook
  template:
    metadata:
      labels:
        app: cnfbook
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
To create the ReplicaSet from the preceding configuration, save the manifest as nginx_Replicaset.yaml and apply it. This will create a ReplicaSet and three Pods, shown as follows:
$ kubectl apply -f nginx_Replicaset.yaml
replicaset.apps/myapp-cnf-replicaset created
$ kubectl get rs
NAME                   DESIRED   CURRENT   READY   AGE
myapp-cnf-replicaset   3         3         3       6s
$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
myapp-cnf-replicaset-drwp9   1/1     Running   0          11s
myapp-cnf-replicaset-pgwj8   1/1     Running   0          11s
myapp-cnf-replicaset-wqkll   1/1     Running   0          11s
Deployments
In production environments, a key thing that is constant is change. You will keep updating and changing the applications in your production environment at a very rapid pace, since the most common task you will be doing in production is rolling out new features of your application. Kubernetes offers the Deployment object as a standard for doing rolling updates, and provides a seamless experience to the cluster administrator as well as end users. This means you can push updates to your applications without taking them down, since a Deployment ensures that only a certain number of Pods are down while they are being updated. These are known as zero-downtime deployments. By default, Deployments ensure that at least 75% of the desired number of Pods are up. Deployments are the reliable, safe, and current way to roll out new versions of applications with zero downtime in Kubernetes.
You can create a Deployment as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
To apply the preceding configuration, save it in a file named nginx_Deployment.yaml and then use kubectl apply as follows:
$ kubectl apply -f nginx_Deployment.yaml
deployment.apps/nginx-deployment created
Interestingly, the Deployment handles ReplicaSets as well, since we declared that we would need three replicas for the Deployment's Pods. We can check this as follows:
$ kubectl get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   3/3     3            3           18s
$ kubectl get rs
NAME                         DESIRED   CURRENT   READY   AGE
nginx-deployment-d46f5678b   3         3         3       29s
$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-d46f5678b-dpc7p   1/1     Running   0          36s
nginx-deployment-d46f5678b-kdjxv   1/1     Running   0          36s
nginx-deployment-d46f5678b-kj8zz   1/1     Running   0          36s
So, from the manifest file, we created three replicas via the .spec.replicas field, and for the Deployment object to find the Pods being managed by nginx-deployment, we make use of the spec.selector field. The Pods are labeled using the template field via metadata.labels.
Now let's roll out a new version of Nginx. Say we want to pin the version of Nginx to 1.14.2. We just need to edit the Deployment, changing the image version and then saving, as follows:
$ kubectl edit deployment.v1.apps/nginx-deployment
deployment.apps/nginx-deployment edited
This will update the Deployment object, and you can check the rollout as follows:
$ kubectl rollout status deployment.v1.apps/nginx-deployment
Waiting for deployment "nginx-deployment" rollout to finish: 1 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 1 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 old replicas are pending termination...
Waiting for deployment "nginx-deployment" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "nginx-deployment" rollout to finish: 1 old replicas are pending termination...
deployment "nginx-deployment" successfully rolled out
The Deployment object ensures that a certain number of Pods are always available and serving while the older Pods are updated. As we already mentioned, by default no more than 25% of Pods are unavailable while an update is being performed. A Deployment also enforces a max surge of 25%, which means only a certain number of Pods are created over the desired number of Pods. So, from the rollout status, you can clearly see that at least two Pods are available at all times while a change is rolled out. You can get the details for your Deployment again by using kubectl describe deployment.
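These percentages can also be set explicitly in the Deployment spec. The following fragment is a sketch of the strategy block, showing the default values described above:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%   # at most a quarter of the desired Pods may be down
      maxSurge: 25%         # at most a quarter extra Pods may be created during the update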
Note
We directly issued a deployment update using kubectl edit, but the preferred approach is to always update the actual manifest file and then apply it with kubectl apply. This also helps you keep your deployment manifests in version control. You can use the --record flag or the kubectl set command to update the Deployment as well.
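For example, the same version pin could be applied from the command line; the container name nginx matches the manifest above, and note that the --record flag has since been deprecated in newer kubectl releases:

$ kubectl set image deployment/nginx-deployment nginx=nginx:1.14.2 --record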
Horizontal Pod Autoscaler
Kubernetes supports dynamic scaling through the Horizontal Pod Autoscaler (HPA), which scales Pods horizontally; that is, it increases or decreases the number of Pods dynamically based on observed metrics, such as CPU utilization. The HPA works via a control loop that, at an interval of 15 seconds by default, checks the observed metrics against the resource utilization you specified (see Figure 4-4).
It is important to note that to use the HPA, we need metrics-server, which collects metrics from kubelets and exposes them in the Kubernetes API server through the Metrics API for use by the HPA. We first create an autoscaler using kubectl autoscale for our deployment, as follows:
$ ~ kubectl autoscale deployment nginx-deployment --cpu-percent=50 --min=3 --max=10
horizontalpodautoscaler.autoscaling/nginx-deployment autoscaled
The preceding kubectl command will create an HPA that ensures that no fewer than 3 and no more than 10 Pods are used in our nginx-deployment. The HPA will increase or decrease the number of replicas to maintain an average CPU utilization across all Pods of no more than 50%.
You can view your HPA using the following:
$ ~ kubectl get hpa
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-deployment   Deployment/nginx-deployment   0%/50%    3         10        3          6m59s
Instead of using kubectl autoscale, we can also define the scaling behavior declaratively with an HPA manifest YAML file that is tied to the Deployment, as sketched below.
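The following is a sketch of such a manifest, equivalent to the kubectl autoscale command above; it uses the long-standing autoscaling/v1 API, though newer API versions expose richer metric options:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment
spec:
  scaleTargetRef:               # the Deployment this HPA scales
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50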
Service
As we mentioned earlier, the Kubernetes environment is a very dynamic system where Pods are created, destroyed, and moved at a varying pace. This dynamic environment also opens the door to a well-known problem: finding the replica Pods where an application resides, since multiple Pods are running for a deployment. Pods also need a way to find other Pods in order to communicate. Kubernetes offers the Service object as an abstraction for a single point of entry to a group of Pods. The Service object has an IP address, a DNS name, and a port that never changes as long as the object exists. Formally known as service discovery, this feature basically helps other Pods/services reach a service in Kubernetes without dealing with the underlying complexity. We will discuss the cloud native service discovery approach in more detail in Chapter 6.
To use a Service, you can use the service manifest YAML. Suppose you have a simple "hello world" application already running as a Deployment. One default technique to expose this service is to specify the service type as ClusterIP. The service will be exposed on the cluster's internal IP and will only be reachable from within the cluster, as follows:
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-service
spec:
  type: ClusterIP
  selector:
    app: hello-world
  ports:
    - port: 8080
      targetPort: 8080
The port represents the port where the Service will be available, and the targetPort is the actual container port to which traffic will be forwarded. In this case, we have exposed port 8080 of the hello-world app on target port 8080 of the Pod:
$ kubectl get svc
NAME                  TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
hello-world-service   ClusterIP   10.32.0.40   <none>        8080/TCP   6s
The IP address shown is the cluster IP and not the actual machine IP address. If you SSH into a worker node, you can check whether the service was exposed on port 8080 by simply doing a curl as follows:
ubuntu@worker0:~$ curl http://10.32.0.40:8080
Hello World
If you try to reach this IP address from outside the cluster (i.e., from any node apart from the workers), you won't be able to connect to it. So, you can use NodePort, which exposes a service on each node's IP at a defined port, as follows:
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-node-service
spec:
  type: NodePort
  selector:
    app: hello-world
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30767
In the service manifest, we have mapped Pod port 8080 to NodePort (i.e., physical instance port) 30767. In this way, you can expose the IP directly or place a load balancer of your choice in front of it. If you now do a get svc, you can see the mapping of ports as follows:
$ kubectl get svc
NAME                       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
hello-world-node-service   NodePort    10.32.0.208   <none>        8080:30767/TCP   8s
hello-world-service        ClusterIP   10.32.0.40    <none>        8080/TCP         41m
Now we can access the service on the node at port 30767:
ubuntu@controller0:~$ curl http://10.240.0.21:30767
Hello World
The IP in the curl command is the worker node's physical IP (not the cluster IP) address, and the port that is being exposed is 30767. You can even directly hit the public IP of the node on port 30767. Figure 4-5 showcases how the cluster IP, node port, and load balancer relate to one another.
Other types of services include LoadBalancer and ExternalName. LoadBalancer exposes the Service externally using a cloud provider's load balancer; the NodePort and ClusterIP Services to which the external load balancer routes are created automatically. ExternalName, on the other hand, maps a service to a DNS name such as my.redisdb.internal.com. In Figure 4-6, you can see how the different service types are related to one another.
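As a sketch (the Service name is hypothetical, and the external IP would be provisioned by the cloud provider), the same hello-world app could be exposed through a cloud load balancer like this:

---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-lb-service
spec:
  type: LoadBalancer
  selector:
    app: hello-world
  ports:
    - port: 80          # port exposed on the cloud load balancer
      targetPort: 8080  # container port, as in the earlier examples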
Ingress
The Service object helps expose the application both inside and outside the cluster, but in production systems we cannot afford to keep opening new and unique ports for all the services we deploy using NodePort, nor can we create a new load balancer every time we choose the LoadBalancer service type. At times we need to deploy an HTTP-based service and also perform SSL offloading, and the Service object doesn't really help us in those instances. In Kubernetes, HTTP load balancing (formally, Layer 7 load balancing) is performed by the Ingress object.
To work with Ingress, we first need to configure an ingress controller.6 We will configure an Azure Kubernetes Service (AKS) Application Gateway Ingress Controller to understand this better and see how it behaves, but in general, one of the easiest ways to get a feel for Ingress is by looking at the following:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-wildcard-host
spec:
  rules:
    - host: "foo.bar.com"
      http:
        paths:
          - pathType: Prefix
            path: "/bar"
            backend:
              service:
                name: hello-world-node-service
                port:
                  number: 8080
    - host: "*.foo.com"
      http:
        paths:
          - pathType: Prefix
            path: "/foo"
            backend:
              service:
                name: service2
                port:
                  number: 80
In the ingress manifest, we have created two rules and mapped the host foo.bar.com with the path /bar to our previous service, hello-world-node-service.
Similarly, we can have multiple paths defined for a domain and route them to other services. There are various ways to configure the ingress according to your needs, which might span from a single domain routing to multiple services, to multiple domains routing to multiple services (see Figure 4-7).
Lastly, you can specify TLS support by creating a Secret object and then using that secret in your ingress spec as follows:
apiVersion: v1
kind: Secret
metadata:
  name: my-tls-secret
  namespace: default
data:
  tls.crt: base64 encoded cert
  tls.key: base64 encoded key
type: kubernetes.io/tls
You can secure the ingress by specifying just the Base64-encoded TLS certificate and the key. When you reference this secret in your ingress manifest, it will appear as follows:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tls-example-ingress
spec:
  tls:
    - hosts:
        - https.mywebsite.example.com
      secretName: my-tls-secret
  rules:
    - host: https.mywebsite.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service1
                port:
                  number: 80
Ingress controllers deal with many features and complexities, and the Azure Application Gateway Ingress Controller handles a lot of them; for example, it leverages Azure's native Layer 7 Application Gateway load balancer to expose your services to the internet. We will discuss this in more detail when we introduce AKS in Chapter 5.
DaemonSet
DaemonSets, as we discussed earlier, are typically used to run an agent across a number of nodes in the Kubernetes cluster. The agent is run inside a container abstracted by Pods. Most of the time, SREs and DevOps engineers prefer to run a log agent or a monitoring agent on each node to gain application telemetry and events. By default, a DaemonSet creates a copy of a Pod on every node, though this can be restricted as well using a node selector. There are a lot of similarities between ReplicaSets and DaemonSets, but the key distinction between them is the requirement (with DaemonSets) of running a single agent (i.e., Pod) application across all of your nodes.
One of the ways to run a logging container is by deploying a Pod in each node using a DaemonSet. Fluentd is an open source logging solution widely used to collect logs from systems. This example shows one of the ways to deploy Fluentd using a DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    central-log-k8s: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: fluentd-elasticsearch
          image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
          resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
In the preceding DaemonSet configuration, we created a DaemonSet that deploys the Fluentd container on each node. We have also added a toleration so that the Fluentd Pod can be scheduled even on the master (control plane) nodes, which carry a NoSchedule taint by default.
Note
Kubernetes offers scheduling features called taints and tolerations:
- Taints in Kubernetes allow a node to repel a set of Pods (i.e., if you want certain nodes not to schedule some types of Pods).
- Tolerations are applied to Pods, and allow (but do not require) the Pods to schedule onto nodes with matching taints.
Taints and tolerations work together to ensure that Pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this indicates that the node should not accept any Pods that do not tolerate the taints.
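For example (the node name and key/value pair are purely illustrative), you could taint a node and then declare a matching toleration in a Pod spec:

$ kubectl taint nodes worker1 dedicated=logging:NoSchedule

tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "logging"
    effect: "NoSchedule"

Only Pods carrying this toleration would be eligible to schedule onto worker1 once the taint is in place.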
You can check the Pods that were created automatically for each worker node as follows:
$ kubectl apply -f fluentd.yaml
daemonset.apps/fluentd-elasticsearch created
$ kubectl get ds --all-namespaces
NAMESPACE     NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-system   fluentd-elasticsearch   3         3         3       3            3           <none>          5m12s
$ kubectl get pods --namespace=kube-system -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP            NODE      NOMINATED NODE   READINESS GATES
fluentd-elasticsearch-5jxg7   1/1     Running   1          45m   10.200.1.53   worker1   <none>           <none>
fluentd-elasticsearch-c5s4c   1/1     Running   1          45m   10.200.2.32   worker2   <none>           <none>
fluentd-elasticsearch-r4pqz   1/1     Running   1          45m   10.200.0.43   worker0   <none>           <none>
Jobs
Sometimes we need to run a small script until it successfully terminates. Kubernetes allows you to do this with the Job object. The Job object creates and manages Pods that run until successful completion, and unlike regular Pods, once the given task is completed, these Job-created Pods are not restarted. You can use a Job YAML to describe a simple Job as follows:
apiVersion: batch/v1
kind: Job
metadata:
  name: examplejob
spec:
  template:
    metadata:
      name: examplejob
    spec:
      containers:
        - name: examplejob
          image: busybox
          command: ["echo", "Cloud Native with Azure"]
      restartPolicy: Never
Here we created a Job to print a shell command. To create the Job, save the preceding YAML and apply it using kubectl apply. Once you apply the job manifest, Kubernetes will create the Job and immediately run it. You can check the status of the Job using kubectl describe as follows:
$ kubectl apply -f job.yaml
job.batch/examplejob created
$ kubectl get jobs
NAME         COMPLETIONS   DURATION   AGE
examplejob   1/1           3s         9s
$ kubectl describe job examplejob
Name:           examplejob
Namespace:      default
Selector:       controller-uid=f6887706-85ef-4752-8911-79cc7ab33886
Labels:         controller-uid=f6887706-85ef-4752-8911-79cc7ab33886
                job-name=examplejob
Annotations:    kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"examplejob","namespace":"default"},"spec":{"template":{"metadat...
Parallelism:    1
Completions:    1
Start Time:     Mon, 07 Sep 2020 01:08:53 +0530
Completed At:   Mon, 07 Sep 2020 01:08:56 +0530
Duration:       3s
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=f6887706-85ef-4752-8911-79cc7ab33886
           job-name=examplejob
  Containers:
   examplejob:
    Image:        busybox
    Port:         <none>
    Host Port:    <none>
    Command:
      echo
      Cloud Native with Azure
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  61s   job-controller  Created pod: examplejob-mcqzc
  Normal  Completed         58s   job-controller  Job completed
Summary
Kubernetes is a powerful platform that was built from a decade of experience gained by containerizing applications at scale at Google. Kubernetes led to the inception of the Cloud Native Computing Foundation and was the first project to graduate under it. This, in turn, streamlined much of the microservices ecosystem and drove broader support for and adoption of cloud native environments. In this chapter, we looked at the various components and concepts that allow Kubernetes to operate at scale. This chapter also sets the stage for upcoming chapters, in which we will utilize the Kubernetes platform to serve production-grade cloud native applications.
Given the underlying complexity of managing a Kubernetes cluster, in Chapter 5 we will look at how to create and use such a cluster. We will also look at the Azure Kubernetes Service, and more.
1 Managing Kubernetes by Brendan Burns and Craig Tracey (O'Reilly, 2019); Kubernetes in Action by Marko Lukša (Manning, 2018).
2 Pods are the smallest deployable units of computing that you can create and manage in Kubernetes.
3 In Kubernetes, services are a way of exposing your Pods so that they can be discovered inside the Kubernetes cluster.
4 Fractional requests are allowed. For example, one CPU can be broken into two 0.5s. The expression 0.1 is equivalent to 100m.
5 Limit and request are measured in bytes. Memory can be expressed as a plain integer or a fixed-point number using the suffixes E, P, T, G, M, K, or their power-of-two equivalents Ei, Pi, Ti, Gi, Mi, Ki.
6 For the Ingress resource to work, the cluster must have an ingress controller running. Unlike other types of controllers that run as part of the kube-controller-manager binary, ingress controllers are not started automatically with a cluster. There are different types of ingress controllers; for example, Azure offers the AKS Application Gateway Ingress Controller for configuring the Azure Application Gateway.