Chapter 4. Automating Database Deployment on Kubernetes with Helm
In the previous chapter, you learned how to deploy both single-node and multinode databases on Kubernetes by hand, creating one element at a time. We did things the "hard way" on purpose to help maximize your understanding of using Kubernetes primitives to set up the compute, network, and storage resources that a database requires. Of course, this doesn't represent the experience of running databases in production on Kubernetes, for a couple of reasons.
First, teams typically don't deploy databases by hand, one YAML file at a time. That can get pretty tedious. And even combining the configurations into a single file could start to get pretty complicated, especially for more sophisticated deployments. Consider the increase in the amount of configuration required in Chapter 3 for Cassandra as a multinode database compared with the single-node MySQL deployment. This won't scale for large enterprises.
Second, while deploying a database is great, what about keeping it running over time? You need your data infrastructure to remain reliable and performant over the long haul, and data infrastructure is known for requiring a lot of care and feeding. Put another way, the task of running a system is often divided into "day one" (the joyous day when you deploy an application to production) and "day two" (every day after the first, when you need to operate and evolve your application while maintaining high availability).
These considerations around database deployment and operations mirror the larger industry trends toward DevOps, an approach in which development teams take a more active role in supporting applications in production. DevOps practices include the use of automation tools for CI/CD of applications, shortening the amount of time it takes for code to get from a developer's desktop into production.
In this chapter, we'll look at tools that help standardize the deployment of databases and other applications. These tools take an infrastructure as code (IaC) approach, allowing you to represent software installation and configuration options in a format that can be executed automatically, reducing the overall amount of configuration code you have to write. We'll also emphasize data infrastructure operations in these next two chapters and carry that theme throughout the remainder of the book.
Deploying Applications with Helm Charts
Let's start by taking a look at a tool that helps you manage configuration complexity: Helm. This package manager for Kubernetes is open source and a CNCF graduated project. The concept of a package manager is a common one across multiple programming languages, such as pip for Python, the Node Package Manager (NPM) for JavaScript, and Ruby's Gems feature. Package managers for specific operating systems also exist, such as Apt for Linux or Homebrew for macOS. As shown in Figure 4-1, the essential elements of a package manager system are the packages, the registries where the packages are stored, and the package manager application (or client), which helps package developers register packages and allows package users to locate, install, and update packages on their local systems.
Helm extends the package management concept to Kubernetes, with some interesting differences. If you've worked with one of the package managers listed previously, you'll be familiar with the idea that a package consists of a binary (executable code) as well as metadata describing the binary, such as its functionality, API, and installation instructions. In Helm, the packages are called charts. Charts describe how to build a Kubernetes application piece by piece by using the Kubernetes resources for compute, networking, and storage introduced in previous chapters, such as Pods, Services, and PersistentVolumeClaims. For compute workloads, the descriptions point to container images that reside in public or private container registries.
Helm allows charts to reference other charts as dependencies, which provides a great way to compose applications by creating assemblies of charts. For example, you could define an application such as the WordPress/MySQL example from the previous chapter by defining a chart for your WordPress deployment that referenced a chart defining a MySQL deployment that you wish to reuse. Or, you might even find a Helm chart that defines an entire WordPress application including the database.
Kubernetes Environment Prerequisites
The examples in this chapter assume you have access to a Kubernetes cluster with a couple of characteristics:
The cluster should have at least three Worker Nodes, in order to demonstrate mechanisms Kubernetes provides to allow you to request Pods to be spread across a cluster. You can create a simple cluster on your desktop by using an open source distribution called kind. See the kind quick start guide for instructions on installing kind and creating a multinode cluster. The code for this example also contains a configuration file you may find useful to create a simple three-node kind cluster; a sketch of such a configuration appears after this list.
You will also need a StorageClass that supports dynamic provisioning. You may wish to follow the instructions in "StorageClasses" for installing a simple StorageClass and provisioner that expose local storage.
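For reference, a minimal kind configuration along these lines produces a control-plane node plus three workers; the file shipped with the book's source code may differ in its details:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker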
Using Helm to Deploy MySQL
To make things a bit more concrete, let's use Helm to deploy the databases you worked with in Chapter 3. First, if it's not already on your system, you'll need to install Helm by using the documentation on the Helm website. Next, add the Bitnami Helm repository:
helm repo add bitnami https://charts.bitnami.com/bitnami
The Bitnami Helm repository contains a variety of Helm charts to help you deploy infrastructure such as databases, analytics engines, and log management systems, as well as applications including ecommerce, customer relationship management (CRM), and you guessed it: WordPress. You can find the source code for the charts in the Bitnami Charts repository on GitHub. The README for this repo provides helpful instructions for using the charts in various Kubernetes distributions.
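Once the repository has been added, you can confirm that a chart is available (and see its latest version) by searching the repositories configured for your Helm client, for example:

helm search repo bitnami/mysql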
Now, let's use the Helm chart provided in the bitnami repository to deploy MySQL. In Helm's terminology, each deployment is known as a release. The simplest possible release that you could create using this chart would look something like this:
# don't execute me yet!
helm install mysql bitnami/mysql
If you execute this command, it will create a release called mysql using the Bitnami MySQL Helm chart with its default settings. As a result, you'd have a single MySQL node. Since you've already deployed a single node of MySQL manually in Chapter 3, let's do something a bit more interesting this time and create a MySQL cluster. To do this, you'll create a values.yaml file with contents like the following, or you can reuse the sample provided in the source code:
architecture: replication
secondary:
  replicaCount: 2
The settings in this values.yaml file let Helm know that you want to use options in the Bitnami MySQL Helm chart to deploy MySQL in a replicated architecture in which there is a primary node and two secondary nodes.
MySQL Helm Chart Configuration Options
If you examine the default values.yaml file provided with the Bitnami MySQL Helm chart, you'll see quite a few options available beyond the simple selections shown here. The configurable values include the following:
Images to pull and their locations
The Kubernetes StorageClass that will be used to generate PersistentVolumes
Security credentials for user and administrator accounts
MySQL configuration settings for primary and secondary replicas
Number of secondary replicas to create
Details of liveness and readiness probes
Affinity and anti-affinity settings
Managing high availability of the database using Pod disruption budgets
You'll already be familiar with many of these concepts; others, such as affinity and Pod disruption budgets, are covered later in the book.
Once you've created the values.yaml file, you can start the cluster using this command:
helm install mysql bitnami/mysql -f values.yaml
After running the command, you'll see the status of the install from Helm, plus instructions that are provided with the chart under NOTES:
NAME: mysql
LAST DEPLOYED: Thu Oct 21 20:39:19 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES: ...
We've omitted the notes here since they are a bit lengthy. They describe suggested commands for monitoring the status as MySQL initializes, how clients and administrators can connect to the database, how to upgrade the database, and more.
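For example, connecting to the primary replica as an administrator from a temporary client Pod looks something like the following (the notes include a similar command; adjust the image tag and Namespace to match your release, and supply the root password from the mysql Secret when prompted):

kubectl run mysql-client --rm -it --restart=Never \
  --image docker.io/bitnami/mysql:8.0.26-debian-10-r60 -- \
  mysql -h mysql-primary.default.svc.cluster.local -uroot -p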
Use Namespaces to Help Isolate Resources
Since we did not specify a Namespace, the Helm release has been installed in the default Kubernetes Namespace, unless you've separately configured a Namespace in your kubeconfig. If you want to install a Helm release in its own Namespace in order to work with its resources more effectively, you could run something like the following:
helm install mysql bitnami/mysql \
  --namespace mysql --create-namespace
This creates a Namespace called mysql and installs the mysql release inside it.
To obtain information about the Helm releases you've created, use the helm list command, which produces output such as this (formatted for readability):
helm list
NAME   NAMESPACE  REVISION  UPDATED
mysql  default    1         2021-10-21 20:39:19

STATUS    CHART        APP VERSION
deployed  mysql-8.8.8  8.0.26
If you haven't installed the release in its own Namespace, it's still simple to see the compute resources that Helm has created on your behalf by running kubectl get all, because they have all been labeled with the name of your release. It may take several minutes for all the resources to initialize, but when complete, the output will look something like this:
kubectl get all
NAME                    READY   STATUS    RESTARTS   AGE
pod/mysql-primary-0     1/1     Running   0          3h40m
pod/mysql-secondary-0   1/1     Running   0          3h40m
pod/mysql-secondary-1   1/1     Running   0          3h38m

NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT
service/mysql-primary              ClusterIP   10.96.107.156   <none>        ...
service/mysql-primary-headless     ClusterIP   None            <none>        ...
service/mysql-secondary            ClusterIP   10.96.250.52    <none>        ...
service/mysql-secondary-headless   ClusterIP   None            <none>        ...

NAME                               READY   AGE
statefulset.apps/mysql-primary     1/1     3h40m
statefulset.apps/mysql-secondary   2/2     3h40m
As you can see, Helm has created two StatefulSets, one for primary replicas and one for secondary replicas. The mysql-primary StatefulSet is managing a single MySQL Pod containing a primary replica, while the mysql-secondary StatefulSet is managing two MySQL Pods containing secondary replicas. See if you can determine which Kubernetes Worker Node each MySQL replica is running on by using the kubectl describe pod command.
From the preceding output, you'll also notice two Services created for each StatefulSet: one headless Service, and another that has a dedicated IP address. Since kubectl get all tells you about only compute resources and Services, you might also be wondering about the storage resources. To check on these, run the kubectl get pv command. Assuming you have a StorageClass installed that supports dynamic provisioning, you should see PersistentVolumes that are bound to PersistentVolumeClaims named data-mysql-primary-0, data-mysql-secondary-0, and data-mysql-secondary-1.
In addition to the resources we've discussed, installing the chart has also resulted in the creation of a few additional resources that we'll explore next.
Namespaces and Kubernetes Resource Scope
If you have chosen to install your Helm release in a Namespace, you'll need to specify the Namespace on most of your kubectl get commands in order to see the created resources. The exception is kubectl get pv, because PersistentVolumes are one of the Kubernetes resources that are not Namespaced; that is, they can be used by Pods in any Namespace. To learn more about which Kubernetes resources in your cluster are Namespaced and which are not, run the command kubectl api-resources.
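For example, to list only the cluster-scoped (non-Namespaced) resource types in your cluster, you can add a filter flag:

kubectl api-resources --namespaced=false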
How Helm Works
Did you wonder what happened when you executed the helm install command with a provided values file? To understand what's going on, let's take a look at the contents of a Helm chart, as shown in Figure 4-2. As we discuss these contents, it will also be helpful to look at the source code of the MySQL Helm chart you just installed.
Looking at the contents of a Helm chart, you'll notice the following:
- README file
- This explains how to use the chart. These instructions are provided along with the chart in registries.
- Chart.yaml file
- This contains metadata about the chart such as its name, publisher, version, keywords, and any dependencies on other charts. These properties are useful when searching Helm registries to find charts.
- values.yaml file
- This lists out the configurable values supported by the chart and their default values. These files typically contain a good number of comments that explain the available options. For the Bitnami MySQL Helm chart, a lot of options are available, as we've noted.
- templates directory
- This contains the Go templates that define the chart. The templates include a Notes.txt file used to generate the output you saw previously after executing the helm install command, and one or more YAML files that describe a pattern for a Kubernetes resource. These YAML files may be organized in subdirectories (for example, the template that defines a StatefulSet for MySQL primary replicas). Finally, a _helpers.tpl file describes how to use the templates. Some of the templates may be used multiple times or not at all, depending on the selected configuration values.
When you execute the helm install command, the Helm client makes sure it has an up-to-date copy of the chart you've named by checking with the source repository. Then it uses the templates to generate YAML configuration code, overriding default values from the chart's values.yaml file with any values you've provided. Finally, it applies this configuration to your currently configured Kubernetes cluster via the Kubernetes API (the same API that kubectl uses).
If you'd like to see the configuration that a Helm chart will produce before applying it, you can use the handy template command. It supports the same syntax as the install command:
helm template mysql bitnami/mysql -f values.yaml
Running this command will produce quite a bit of output, so you may want to redirect it to a file (append > values-template.yaml to the command) so you can take a longer look. Alternatively, you can look at the copy we have saved in the source code repository.
You'll notice that several types of resources are created, as summarized in Figure 4-3. Many of the resources shown have been discussed, including the StatefulSets for managing the primary and secondary replicas, each with its own Service (the chart also creates headless Services that are not shown in the figure). Each Pod has its own PersistentVolumeClaim that is mapped to a unique PersistentVolume.
Figure 4-3 also includes resource types we haven't discussed previously. Notice first that each StatefulSet has an associated ConfigMap that is used to provide a common set of configuration settings to its Pods. Next, notice the Secret named mysql, which stores passwords needed for accessing various interfaces exposed by the database nodes. Finally, a ServiceAccount resource is applied to every Pod created by this Helm release.
Let's focus on some interesting aspects of this deployment, including the usage of labels, ServiceAccounts, Secrets, and ConfigMaps.
Labels
If you look through the output from the helm template command, you'll notice that the resources have a common set of labels:
labels:
  app.kubernetes.io/name: mysql
  helm.sh/chart: mysql-8.8.8
  app.kubernetes.io/instance: mysql
  app.kubernetes.io/managed-by: Helm
These labels help identify the resources as being part of the mysql application and indicate that they are managed by Helm using a specific chart version. Labels are also handy for selecting resources, which comes up frequently when defining configurations for other resources.
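For example, you could use one of these labels as a selector to list just the Pods belonging to the release; a quick check, assuming the default label values shown above:

kubectl get pods -l app.kubernetes.io/instance=mysql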
ServiceAccounts
Kubernetes clusters make a distinction between human users and applications for access control purposes. A ServiceAccount is a Kubernetes resource that represents an application and what it is allowed to access. For example, a ServiceAccount may be given access to some portions of the Kubernetes API, or access to one or more secrets containing privileged information such as login credentials. This latter capability is used in your Helm installation of MySQL to share credentials between Pods.
Every Pod created in Kubernetes has a ServiceAccount assigned to it. If you do not specify one, the default ServiceAccount is used. Installing the MySQL Helm chart creates a ServiceAccount called mysql. You can see the specification for this resource in the generated template:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mysql
  namespace: default
  labels: ...
  annotations:
secrets:
  - name: mysql
As you can see, this ServiceAccount has access to a Secret called mysql, which we'll discuss shortly. A ServiceAccount can also have an additional type of Secret known as an imagePullSecret. These Secrets are used when an application needs to use images from a private registry.
By default, a ServiceAccount does not have any access to the Kubernetes API. To give this ServiceAccount the access it needs, the MySQL Helm chart creates a Role specifying the Kubernetes resources and operations it can access, and a RoleBinding to associate the ServiceAccount with the Role. We'll discuss ServiceAccounts and role-based access in Chapter 5.
Secrets
As you learned in Chapter 2, a Secret provides secure access to information you need to keep private. Your mysql Helm release contains a Secret called mysql containing login credentials for the MySQL instances themselves:
apiVersion: v1
kind: Secret
metadata:
  name: mysql
  namespace: default
  labels: ...
type: Opaque
data:
  mysql-root-password: "VzhyNEhIcmdTTQ=="
  mysql-password: "R2ZtNkFHNDhpOQ=="
  mysql-replication-password: "bDBiTWVzVmVORA=="
The three passwords represent different types of access: the mysql-root-password provides administrative access to the MySQL node, the mysql-replication-password is used by the nodes to communicate with one another for the purposes of data replication, and the mysql-password is used by client applications to access the database to write and read data.
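Note that Secret values are Base64 encoded rather than encrypted. If you need to recover one of these passwords, for instance to connect a client manually, a command along these lines should work, assuming the Secret is named mysql and lives in the default Namespace:

kubectl get secret mysql -o jsonpath='{.data.mysql-root-password}' | base64 --decode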
ConfigMaps
The Bitnami MySQL Helm chart creates Kubernetes ConfigMap resources to represent the configuration settings used for Pods that run the MySQL primary and secondary replica nodes. ConfigMaps store configuration data as key-value pairs. For example, the ConfigMap created by the Helm chart for the primary replicas looks like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-primary
  namespace: default
  labels: ...
data:
  my.cnf: |-
    [mysqld]
    default_authentication_plugin=mysql_native_password
    ...
In this case, the key is the name my.cnf, which represents a filename, and the value is a multiline set of configuration settings that represent the contents of a configuration file (which we've abbreviated here). Next, look at the definition of the StatefulSet for the primary replicas. Notice that the contents of the ConfigMap are mounted as a read-only file inside each Pod, according to the Pod specification for the StatefulSet (again, we've omitted some detail to focus on key areas):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-primary
  namespace: default
  labels: ...
spec:
  replicas: 1
  selector:
    matchLabels: ...
  serviceName: mysql-primary
  template:
    metadata:
      annotations: ...
      labels: ...
    spec:
      ...
      serviceAccountName: mysql
      containers:
        - name: mysql
          image: docker.io/bitnami/mysql:8.0.26-debian-10-r60
          volumeMounts:
            - name: data
              mountPath: /bitnami/mysql
            - name: config
              mountPath: /opt/bitnami/mysql/conf/my.cnf
              subPath: my.cnf
      volumes:
        - name: config
          configMap:
            name: mysql-primary
Mounting the ConfigMap as a volume in a container results in the creation of a read-only file in the mount directory that is named according to the key and has the value as its content. For our example, mounting the ConfigMap in the Pod's mysql container results in the creation of the file /opt/bitnami/mysql/conf/my.cnf.
This is one of several ways that ConfigMaps can be used in Kubernetes applications:
As described in the Kubernetes documentation, you could choose to store configuration data in more granular key-value pairs, which also makes it easier to access individual values in your application.
You can also reference individual key-value pairs as environment variables that you pass to a container, as shown in the sketch after this list.
Finally, applications can access ConfigMap contents via the Kubernetes API.
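For instance, injecting a single ConfigMap value as an environment variable might look like the following fragment of a Pod specification; the container, image, ConfigMap name, and key shown here are all hypothetical:

containers:
  - name: app                    # hypothetical application container
    image: myapp:1.0             # hypothetical image
    env:
      - name: LOG_LEVEL
        valueFrom:
          configMapKeyRef:
            name: app-settings   # hypothetical ConfigMap
            key: log-level       # key within that ConfigMap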
More Configuration Options
Now that you have a Helm release with a working MySQL cluster, you can point an application to it, such as WordPress. Why not see whether you can adapt the WordPress deployment from Chapter 3 to point to the MySQL cluster you've created here?
For further learning, you could also compare your resulting configuration with that produced by the Bitnami WordPress Helm chart, which uses MariaDB instead of MySQL but is otherwise quite similar.
Updating Helm Charts
If you're running a Helm release in a production environment, chances are you're going to need to maintain it over time. You might want to update a Helm release for various reasons:
A new version of a chart is available.
A new version of an image used by your application is available.
You want to change the selected options.
To check for a new version of a chart, execute the helm repo update command. Running this command with no options looks for updates in all of the chart repositories you have configured for your Helm client:
helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈Happy Helming!⎈
Next, you'll want to make any desired updates to your configured values. If you're upgrading to a new version of a chart, make sure to check the release notes and documentation of the configurable values. It's a good idea to test out an upgrade before applying it. The --dry-run option allows you to do this, producing output similar to that of the helm template command:
helm upgrade mysql bitnami/mysql -f values.yaml --dry-run
Using an Overlay Configuration File
One useful option for the upgrade is to specify the values you wish to override in a new configuration file and apply both the old and new files, something like this:
helm upgrade mysql bitnami/mysql \
  -f values.yaml -f new-values.yaml
Configuration files are applied in the order they appear on the command line, so if you use this approach, make sure the file containing your overrides appears after your original values file.
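For example, a hypothetical new-values.yaml that only increases the number of secondary replicas might contain nothing but the overridden value:

secondary:
  replicaCount: 3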
Once you've applied the upgrade, Helm sets about its work, updating only those resources in the release that are affected by your configuration changes. If you've specified changes to the Pod template for a StatefulSet, the Pods will be restarted according to the update policy specified for the StatefulSet, as we discussed in "StatefulSet lifecycle management".
Uninstalling Helm Charts
When you are finished using your Helm release, you can uninstall it by name:
helm uninstall mysql
Note that Helm does not remove any of the PersistentVolumeClaims or PersistentVolumes that were created for this Helm chart, following the behavior of StatefulSets discussed in Chapter 3.
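If you also want to reclaim the storage, you can delete the claims yourself. Assuming the claims carry the release labels shown earlier, something like this should work; otherwise, delete them by name (data-mysql-primary-0, and so on):

kubectl delete pvc -l app.kubernetes.io/instance=mysql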
Using Helm to Deploy Apache Cassandra
Now let's switch gears and look at deploying Apache Cassandra by using Helm. In this section, you'll use another chart provided by Bitnami, so there's no need to add another repository. You can find the implementation of this chart on GitHub. Helm provides a quick way to see the metadata about this chart:
helm show chart bitnami/cassandra
After reviewing the metadata, you'll also want to learn about the configurable values. You can examine the values.yaml file in the GitHub repo, or use another option on the show command:
helm show values bitnami/cassandra
The list of options for this chart is shorter than the list for the MySQL chart, because Cassandra doesn't have the concept of primary and secondary replicas. However, you'll certainly see similar options for images, StorageClasses, security, liveness and readiness probes, and so on. Some configuration options are unique to Cassandra, such as those having to do with JVM settings and seed nodes (as discussed in Chapter 3).
One interesting feature of this chart is the ability to export metrics from Cassandra nodes. If you set metrics.enabled=true, the chart will inject a sidecar container into each Cassandra Pod that exposes a port that can be scraped by Prometheus. Other values under metrics configure which metrics are exported, the collection frequency, and more. While we won't use this feature here, metrics reporting is a key part of managing data infrastructure that we'll cover in Chapter 6.
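If you did want to try it, enabling the exporter is just another value to override at install time, for example:

helm install cassandra bitnami/cassandra --set metrics.enabled=true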
For a simple three-node Cassandra configuration, you could set the replica count to 3 and leave the other configuration values at their defaults. Since you're overriding only a single configuration value, this is a good time to take advantage of Helm's support for setting values on the command line instead of providing a values.yaml file:
helm install cassandra bitnami/cassandra --set replicaCount=3
As discussed previously, you can use the helm template command to check the configuration before installing it, or look at the file we've saved on GitHub. However, since you've already created the release, you can also use this command:
helm get manifest cassandra
Looking through the resources in the YAML, you'll see that a similar set of infrastructure has been established, as shown in Figure 4-4.
The configuration includes the following:
- A ServiceAccount referencing a Secret, which contains the password for the cassandra administrator account.
- A single StatefulSet, with a headless Service used to reference its Pods. The Pods are spread evenly across the available Kubernetes Worker Nodes, which we'll discuss in the next section. The Service exposes Cassandra ports used for intra-node communication (7000, with 7001 used for secure communication via TLS), administration via JMX (7199), and client access via CQL (9042).
This configuration represents a simple Cassandra topology, with all three nodes in a single Datacenter and rack. This simple topology reflects one of the limitations of this chart: it does not provide the ability to create a Cassandra cluster consisting of multiple Datacenters and racks. To create a more complex deployment, you'd have to install multiple Helm releases, using the same clusterName (in this case, you're using the default name cassandra), but a different Datacenter and rack per deployment. You'd also need to obtain the IP addresses of a couple of nodes in the first Datacenter to use as additionalSeeds when configuring the releases for the other racks.
Affinity and Anti-Affinity
As shown in Figure 4-4, the Cassandra nodes are spread evenly across the Worker Nodes in your cluster. To verify this in your own Cassandra release, you could run something like the following:
kubectl describe pods | grep "^Name:" -A 3
Name:         cassandra-0
Namespace:    default
Priority:     0
Node:         kind-worker/172.20.0.7
--
Name:         cassandra-1
Namespace:    default
Priority:     0
Node:         kind-worker2/172.20.0.6
--
Name:         cassandra-2
Namespace:    default
Priority:     0
Node:         kind-worker3/172.20.0.5
As you can see, each Cassandra node is running on a different Worker Node. If your Kubernetes cluster has at least three Worker Nodes and no other workloads, you'll likely observe similar behavior. While this even allocation could happen naturally in a cluster with an even load across Worker Nodes, that is probably not the case in your production environment. To maximize the availability of your data, you'll want to honor the intent of Cassandra's architecture, which assumes that nodes run on different machines.
To help guarantee this isolation, the Bitnami Helm chart uses Kubernetes's affinity capabilities, specifically anti-affinity. If you examine the generated configuration for the Cassandra StatefulSet, you'll see the following:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  namespace: default
  labels: ...
spec:
  ...
  template:
    metadata:
      labels: ...
    spec:
      ...
      affinity:
        podAffinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: cassandra
                    app.kubernetes.io/instance: cassandra
                namespaces:
                  - "default"
                topologyKey: kubernetes.io/hostname
              weight: 1
        nodeAffinity:
As shown here, the Pod template specification lists three possible types of affinity, with only the podAntiAffinity being defined. What do these concepts mean?
- Pod affinity
- The preference that a Pod is scheduled onto a node where another specific Pod is running. For example, Pod affinity could be used to colocate a web server with its cache.
- Pod anti-affinity
- The opposite of Pod affinity; that is, a preference that a Pod not be scheduled on a node where another identified Pod is running. This is the constraint used in this example, as we'll discuss shortly.
- Node affinity
- A preference that a Pod be run on a node with specific characteristics.
Each type of affinity can be expressed as either a hard or a soft constraint, known respectively as requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution. The first specifies rules that must be met before a Pod can be scheduled on a node, while the second specifies a preference that the scheduler will attempt to meet but may relax if necessary in order to schedule the Pod.
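To make the hard form concrete, a hypothetical rule that requires a web server Pod to land on a node already running a Pod labeled app: cache (the colocation example mentioned above) might look like this:

affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: cache            # hypothetical label applied to the cache Pods
        topologyKey: kubernetes.io/hostname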
IgnoredDuringExecution implies that the constraints apply only when Pods are first scheduled. In the future, RequiredDuringExecution variants are planned, with names like requiredDuringSchedulingRequiredDuringExecution and preferredDuringSchedulingRequiredDuringExecution. These will ask Kubernetes to evict Pods (that is, move them to another node) that no longer meet the criteria, for example, because of a change in their labels.
Looking at the preceding example, the Pod template specification for the Cassandra StatefulSet specifies an anti-affinity rule using the labels that are applied to each Cassandra Pod. The net effect is that Kubernetes will try to spread the Pods across the available Worker Nodes.
Those are the highlights of looking at the Bitnami Helm chart for Cassandra. To clean things up, uninstall the Cassandra release:
helm uninstall cassandra
If you don't want to work with Bitnami Helm charts any longer, you can also remove the repository from your Helm client:
helm repo remove bitnami
More Kubernetes Scheduling Constraints
Kubernetes supports additional mechanisms for providing hints to its scheduler about Pod placement. One of the simplest is the nodeSelector, which is very similar to node affinity but with a less expressive syntax: it matches on one or more node labels by using AND logic. Since you may or may not have the privileges required to attach labels to Worker Nodes in your cluster, Pod affinity is often a better option. Taints and tolerations are another mechanism that can be used to configure Worker Nodes to repel specific Pods from being scheduled on those nodes.
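As a rough sketch, both mechanisms appear in a Pod specification along these lines; the node label and taint key shown here are hypothetical:

spec:
  nodeSelector:
    disktype: ssd                # schedule only onto nodes labeled disktype=ssd
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "database"
      effect: "NoSchedule"       # allows scheduling onto nodes tainted dedicated=database:NoSchedule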
In general, you want to be careful to understand all of the constraints you're putting on the Kubernetes scheduler from various workloads so as not to overly constrain its ability to place Pods. See the Kubernetes documentation for more information on scheduling constraints. We'll also look at how Kubernetes allows you to plug in different schedulers in "Alternative Schedulers for Kubernetes".
Helm, CI/CD, and Operations
Helm is a powerful tool focused on one primary task: deploying complex applications to Kubernetes clusters. To get the most benefit from Helm, you'll want to consider how it fits into your larger CI/CD toolset:
Automation servers such as Jenkins automatically build, test, and deploy software according to scripts known as jobs. These jobs are typically run based on predefined triggers, such as a commit to a source repository. Helm charts can be referenced in jobs to install an application under test and its supporting infrastructure in a Kubernetes cluster.
IaC automation tools such as Terraform allow you to define templates and scripts that describe how to create infrastructure in a variety of cloud environments. For example, you could write a Terraform script that automates the creation of a new VPC within a specific cloud provider and the creation of a new Kubernetes cluster within that VPC. The script could then use Helm to install applications within the Kubernetes cluster.
While overlaps certainly occur in the capabilities these tools provide, you'll want to consider the strengths and limitations of each as you construct your toolset. For this reason, we want to make sure to note that Helm has limitations when it comes to managing the operations of applications that it deploys. To get a good picture of the challenges involved, we spoke to a practitioner who has built assemblies of Helm charts to manage a complex database deployment. This discussion begins to introduce concepts like Kubernetes Custom Resource Definitions (CRDs) and the operator pattern, both of which we'll cover in depth in Chapter 5.
As John Sanda notes in his commentary, Helm is a powerful tool for scripting the deployment of applications consisting of multiple Kubernetes resources, but it can be less effective at managing more complex operational tasks. As you'll see in the chapters to come, a common pattern used for data infrastructure and other complex applications is to use a Helm chart to deploy an operator, which can then in turn manage both the deployment and lifecycle of the application.
Summary
In this chapter, you've learned how a package management tool like Helm can help you manage the deployment of applications on Kubernetes, including your database infrastructure. Along the way, you've also learned how to use some additional Kubernetes resources like ServiceAccounts, Secrets, and ConfigMaps. Now it's time to round out our discussion of running databases on Kubernetes. In the next chapter, we'll take a deeper dive into managing database operations on Kubernetes by using the operator pattern.