Chapter 4. Automating Database Deployment on Kubernetes with Helm

In the previous chapter, you learned how to deploy both single-node and multinode databases on Kubernetes by hand, creating one element at a time. We did things the “hard way” on purpose to help maximize your understanding of using Kubernetes primitives to set up the compute, network, and storage resources that a database requires. Of course, this doesn’t represent the experience of running databases in production on Kubernetes, for a couple of reasons.

First, teams typically don’t deploy databases by hand, one YAML file at a time. That can get pretty tedious. And even combining the configurations into a single file could start to get pretty complicated, especially for more sophisticated deployments. Consider the increase in the amount of configuration required in Chapter 3 for Cassandra as a multinode database compared with the single-node MySQL deployment. This won’t scale for large enterprises.

Second, while deploying a database is great, what about keeping it running over time? You need your data infrastructure to remain reliable and performant over the long haul, and data infrastructure is known for requiring a lot of care and feeding. Put another way, the task of running a system is often divided into “day one” (the joyous day when you deploy an application to production) and “day two” (every day after the first, when you need to operate and evolve your application while maintaining high availability).

These considerations around database deployment and operations mirror the larger industry trend toward DevOps, an approach in which development teams take a more active role in supporting applications in production. DevOps practices include the use of automation tools for continuous integration and continuous delivery (CI/CD), shortening the amount of time it takes for code to get from a developer’s desktop into production.

In this chapter, we’ll look at tools that help standardize the deployment of databases and other applications. These tools take an infrastructure as code (IaC) approach, allowing you to represent software installation and configuration options in a format that can be executed automatically, reducing the overall amount of configuration code you have to write. We’ll also emphasize data infrastructure operations in these next two chapters and carry that theme throughout the remainder of the book.

Deploying Applications with Helm Charts

Let’s start by taking a look at a tool that helps you manage the complexity of configuration: Helm. This package manager for Kubernetes is open source and a CNCF graduated project. The concept of a package manager is a common one across multiple programming languages, such as pip for Python, the Node Package Manager (NPM) for JavaScript, and RubyGems for Ruby. Package managers for specific operating systems also exist, such as Apt for Debian-based Linux distributions, or Homebrew for macOS. As shown in Figure 4-1, the essential elements of a package manager system are the packages, the registries where the packages are stored, and the package manager application (or client), which helps package developers publish packages to registries and allows users to locate, install, and update packages on their local systems.

Figure 4-1. Helm, a package manager for Kubernetes

Helm extends the package management concept to Kubernetes, with some interesting differences. If you’ve worked with one of the package managers listed previously, you’ll be familiar with the idea that a package consists of a binary (executable code) as well as metadata describing the binary, such as its functionality, API, and installation instructions. In Helm, the packages are called charts. Charts describe how to build a Kubernetes application piece by piece by using the Kubernetes resources for compute, networking, and storage introduced in previous chapters, such as Pods, Services, and PersistentVolumeClaims. For compute workloads, the descriptions point to container images that reside in public or private container registries.

Helm allows charts to reference other charts as dependencies, which provides a great way to compose applications by creating assemblies of charts. For example, you could define an application such as the WordPress/MySQL example from the previous chapter by defining a chart for your WordPress deployment that referenced a chart defining a MySQL deployment that you wish to reuse. Or, you might even find a Helm chart that defines an entire WordPress application including the database.
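To make this a bit more concrete, chart dependencies are declared in a chart’s Chart.yaml file. The following is a minimal, hypothetical sketch of how a custom WordPress chart might declare a dependency on the Bitnami MySQL chart; the chart name and version shown here are placeholders rather than values taken from a real chart:

# Chart.yaml (hypothetical excerpt)
apiVersion: v2
name: my-wordpress
version: 0.1.0
dependencies:
  - name: mysql
    version: 8.x.x
    repository: https://charts.bitnami.com/bitnami

Running helm dependency update in the chart directory would then download the referenced chart into its charts/ subdirectory.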

Kubernetes Environment Prerequisites

The examples in this chapter assume you have access to a Kubernetes cluster with a couple of characteristics:

  • The cluster should have at least three Worker Nodes, in order to demonstrate mechanisms Kubernetes provides to allow you to request Pods to be spread across a cluster. You can create a simple cluster on your desktop by using an open source distribution called kind. See the kind quick start guide for instructions on installing kind and creating a multinode cluster. The code for this example also contains a configuration file you may find useful to create a simple three-node kind cluster; a similar configuration is sketched after this list.

  • You will also need a StorageClass that supports dynamic provisioning. You may wish to follow the instructions in “StorageClasses” for installing a simple StorageClass and provisioner that expose local storage.
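If you’d like to define such a cluster yourself, a kind configuration along the following lines describes one control plane node and three Worker Nodes (this is a minimal sketch; the filename kind-config.yaml is just a suggestion):

# kind-config.yaml: one control plane node plus three Worker Nodes
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker

You would then create the cluster with kind create cluster --config kind-config.yaml.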

Using Helm to Deploy MySQL

To make things a bit more concrete, let’s use Helm to deploy the databases you worked with in Chapter 3. First, if it’s not already on your system, you’ll need to install Helm by using the documentation on the Helm website. Next, add the Bitnami Helm repository:

helm repo add bitnami https://charts.bitnami.com/bitnami

The Bitnami Helm repository contains a variety of Helm charts to help you deploy infrastructure such as databases, analytics engines, and log management systems, as well as applications including ecommerce, customer relationship management (CRM), and you guessed it: WordPress. You can find the source code for the charts in the Bitnami Charts repository on GitHub. The README for this repo provides helpful instructions for using the charts in various Kubernetes distributions.

Now, let’s use the Helm chart provided in the bitnami repository to deploy MySQL. In Helm’s terminology, each deployment is known as a release. The simplest possible release that you could create using this chart would look something like this:

# don’t execute me yet!
helm install mysql bitnami/mysql

If you execute this command, it will create a release called mysql using the Bitnami MySQL Helm chart with its default settings. As a result, you’d have a single MySQL node. Since you’ve already deployed a single node of MySQL manually in Chapter 3, let’s do something a bit more interesting this time and create a MySQL cluster. To do this, you’ll create a values.yaml file with contents like the following, or you can reuse the sample provided in the source code:

architecture: replication
secondary:
  replicaCount: 2

The settings in this values.yaml file let Helm know that you want to use options in the Bitnami MySQL Helm chart to deploy MySQL in a replicated architecture in which there is a primary node and two secondary nodes.

MySQL Helm Chart Configuration Options

If you examine the default values.yaml file provided with the Bitnami MySQL Helm chart, you’ll see quite a few options available beyond the simple selections shown here. The configurable values include the following:

  • Images to pull and their locations

  • The Kubernetes StorageClass that will be used to generate PersistentVolumes

  • Security credentials for user and administrator accounts

  • MySQL configuration settings for primary and secondary replicas

  • Number of secondary replicas to create

  • Details of liveness and readiness probes

  • Affinity and anti-affinity settings

  • Pod disruption budgets for managing high availability of the database

You’ll already be familiar with many of these concepts; others, like affinity and Pod disruption budgets, are covered later in the book.
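
To give a sense of how several of these options fit together, here is a hypothetical values.yaml that sets a few of them at once. The key names follow the Bitnami MySQL chart’s conventions at the time of writing, but you should confirm them against the output of helm show values bitnami/mysql for the chart version you’re installing. This file is only for illustration; the examples that follow continue to use the simpler file shown earlier:

# illustrative only; verify key names against the chart's own values.yaml
architecture: replication
auth:
  database: wordpress
  username: wordpress
secondary:
  replicaCount: 2
primary:
  persistence:
    size: 10Gi
global:
  storageClass: standard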

Once you’ve created the values.yaml file, you can start the cluster using this command:

helm install mysql bitnami/mysql -f values.yaml

After running the command, you’ll see the status of the install from Helm, plus instructions that are provided with the chart under NOTES:

NAME: mysql
LAST DEPLOYED: Thu Oct 21 20:39:19 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
…

We’ve omitted the notes here since they are a bit lengthy. They describe suggested commands for monitoring the status as MySQL initializes, how clients and administrators can connect to the database, how to upgrade the database, and more.

Use Namespaces to Help Isolate Resources

Since we did not specify a Namespace, the Helm release has been installed in the default Kubernetes Namespace unless you’ve separately configured a Namespace in your kubeconfig. If you want to install a Helm release in its own Namespace in order to work with its resources more effectively, you could run something like the following:

helm install mysql bitnami/mysql \
  --namespace mysql --create-namespace

This creates a Namespace called mysql and installs the mysql release inside it.

To obtain information about the Helm releases you’ve created, use the helm list command, which produces output such as this (formatted for readability):

helm list
NAME   NAMESPACE  REVISION  UPDATED   
mysql  default    1         2021-10-21 20:39:19

STATUS    CHART        APP VERSION
deployed  mysql-8.8.8  8.0.26

If you haven’t installed the release in its own Namespace, it’s still simple to see the compute resources that Helm has created on your behalf by running kubectl get all, because they have all been labeled with the name of your release. It may take several minutes for all the resources to initialize, but when complete, it will look something like this:

kubectl get all
NAME                    READY   STATUS    RESTARTS   AGE
pod/mysql-primary-0     1/1     Running   0          3h40m
pod/mysql-secondary-0   1/1     Running   0          3h40m
pod/mysql-secondary-1   1/1     Running   0          3h38m

NAME                              TYPE       CLUSTER-IP     EXTERNAL-IP  PORT     
service/mysql-primary             ClusterIP  10.96.107.156  <none>       ...  
service/mysql-primary-headless    ClusterIP  None           <none>       ...  
service/mysql-secondary           ClusterIP  10.96.250.52   <none>       ... 
service/mysql-secondary-headless  ClusterIP  None           <none>       ...  

NAME                               READY   AGE
statefulset.apps/mysql-primary     1/1     3h40m
statefulset.apps/mysql-secondary   2/2     3h40m

As you can see, Helm has created two StatefulSets, one for primary replicas and one for secondary replicas. The mysql-primary StatefulSet is managing a single MySQL Pod containing a primary replica, while the mysql-secondary StatefulSet is managing two MySQL Pods containing secondary replicas. See if you can determine which Kubernetes Worker Node each MySQL replica is running on by using the kubectl describe pod command.

From the preceding output, you’ll also notice two Services created for each StatefulSet: a headless Service and a Service with a dedicated cluster IP address. Since kubectl get all tells you about only compute resources and services, you might also be wondering about the storage resources. To check on these, run the kubectl get pv command. Assuming you have a StorageClass installed that supports dynamic provisioning, you should see PersistentVolumes that are bound to PersistentVolumeClaims named data-mysql-primary-0, data-mysql-secondary-0, and data-mysql-secondary-1.

In addition to the resources we’ve discussed, installing the chart has also resulted in the creation of a few additional resources that we’ll explore next.

Namespaces and Kubernetes Resource Scope

If you have chosen to install your Helm release in a Namespace, you’ll need to specify the Namespace on most of your kubectl get commands in order to see the created resources. The exception is kubectl get pv, because PersistentVolumes are one of the Kubernetes resources that are not Namespaced; that is, they can be used by Pods in any Namespace. To learn more about which Kubernetes resources in your cluster are Namespaced and which are not, run the command kubectl api-resources.
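For example, you can list only the cluster-scoped (non-Namespaced) resource types like this:

kubectl api-resources --namespaced=false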

How Helm Works

Did you wonder what happened when you executed the helm install command with a provided values file? To understand what’s going on, let’s take a look at the contents of a Helm chart, as shown in Figure 4-2. As we discuss these contents, it will also be helpful to look at the source code of the MySQL Helm chart you just installed.

Figure 4-2. Customizing a Helm release using a values.yaml file

Looking at the contents of a Helm chart, you’ll notice the following:

README file
This explains how to use the chart. These instructions are provided along with the chart in registries.
Chart.yaml file
This contains metadata about the chart such as its name, publisher, version, keywords, and any dependencies on other charts. These properties are useful when searching Helm registries to find charts.
values.yaml file
This lists out the configurable values supported by the chart and their default values. These files typically contain a good number of comments that explain the available options. For the Bitnami MySQL Helm chart, a lot of options are available, as we’ve noted.
templates directory
This contains Go templates that define the chart. The templates include a NOTES.txt file used to generate the output you saw previously after executing the helm install command, and one or more YAML files that describe a pattern for a Kubernetes resource. These YAML files may be organized in subdirectories (for example, the template that defines a StatefulSet for MySQL primary replicas). Finally, a _helpers.tpl file defines named template snippets that the other templates can reuse. Some of the templates may be used multiple times or not at all, depending on the selected configuration values.

When you execute the helm install command, the Helm client makes sure it has an up-to-date copy of the chart you’ve named by checking with the source repository. Then it uses the templates to generate YAML configuration code, overriding default values from the chart’s values.yaml file with any values you’ve provided. Finally, it applies this configuration to your currently configured Kubernetes cluster through the Kubernetes API, using the same cluster connection settings (kubeconfig) that kubectl uses.
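
To make the template mechanism concrete, here is a simplified, hypothetical excerpt of what a StatefulSet template might look like; the real templates in the Bitnami chart are considerably more elaborate, but the substitution works the same way:

# templates/secondary/statefulset.yaml (simplified, hypothetical excerpt)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ .Release.Name }}-secondary
spec:
  replicas: {{ .Values.secondary.replicaCount }}

With the values.yaml shown earlier, .Values.secondary.replicaCount resolves to 2, so the rendered manifest contains replicas: 2.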

If you’d like to see the configuration that a Helm chart will produce before applying it, you can use the handy template command. It supports the same syntax as the install command:

helm template mysql bitnami/mysql -f values.yaml

Running this command will produce quite a bit of output, so you may want to redirect it to a file (append > values-template.yaml to the command) so you can take a longer look. Alternatively, you can look at the copy we have saved in the source code repository.

You’ll notice that several types of resources are created, as summarized in Figure 4-3. Many of the resources shown have been discussed, including the StatefulSets for managing the primary and secondary replicas, each with its own service (the chart also creates headless services that are not shown in the figure). Each Pod has its own PersistentVolumeClaim that is mapped to a unique PersistentVolume.

Figure 4-3 also includes resource types we haven’t discussed previously. Notice first that each StatefulSet has an associated ConfigMap that is used to provide a common set of configuration settings to its Pods. Next, notice the Secret named mysql, which stores passwords needed for accessing various interfaces exposed by the database nodes. Finally, a ServiceAccount resource is applied to every Pod created by this Helm release.

Let’s focus on some interesting aspects of this deployment, including the usage of labels, ServiceAccounts, Secrets, and ConfigMaps.

Figure 4-3. Deploying MySQL using the Bitnami Helm chart

Labels

If you look through the output from the helm template, you’ll notice that the resources have a common set of labels:

  labels:
    app.kubernetes.io/name: mysql
    helm.sh/chart: mysql-8.8.8
    app.kubernetes.io/instance: mysql
    app.kubernetes.io/managed-by: Helm

These labels help identify the resources as being part of the mysql application and indicate that they are managed by Helm using a specific chart version. They also allow resources to be selected by label, which other resource definitions and tools frequently rely on.
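
For example, you can use a label selector with kubectl to list only the resources belonging to this release:

kubectl get all -l app.kubernetes.io/instance=mysql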

ServiceAccounts

Kubernetes clusters make a distinction between human users and applications for access control purposes. A ServiceAccount is a Kubernetes resource that represents an application and what it is allowed to access. For example, a ServiceAccount may be given access to some portions of the Kubernetes API, or access to one or more secrets containing privileged information such as login credentials. This latter capability is used in your Helm installation of MySQL to share credentials between Pods.

Every Pod created in Kubernetes has a ServiceAccount assigned to it. If you do not specify one, the default ServiceAccount is used. Installing the MySQL Helm chart creates a ServiceAccount called mysql. You can see the specification for this resource in the generated template:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: mysql
  namespace: default
  labels: ...
  annotations:
secrets:
  - name: mysql

As you can see, this ServiceAccount has access to a Secret called mysql, which we’ll discuss shortly. A ServiceAccount can also have an additional type of Secret known as an imagePullSecret. These Secrets are used when an application needs to use images from a private registry.

By default, a ServiceAccount does not have any access to the Kubernetes API. To give this ServiceAccount the access it needs, the MySQL Helm chart creates a Role specifying the Kubernetes resources and operations the application may use, and a RoleBinding to associate the ServiceAccount with the Role. We’ll discuss ServiceAccounts and role-based access in Chapter 5.
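
While we’ll defer the details to Chapter 5, the general shape of such a pairing looks like the following minimal sketch; the names and rules here are illustrative rather than what the chart actually generates:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mysql
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mysql
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: mysql
subjects:
  - kind: ServiceAccount
    name: mysql
    namespace: default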

Secrets

As you learned in Chapter 2, a Secret provides secure access to information you need to keep private. Your mysql Helm release contains a Secret called mysql containing login credentials for the MySQL instances themselves:

apiVersion: v1
kind: Secret
metadata:
  name: mysql
  namespace: default
  labels: ...
type: Opaque
data:
  mysql-root-password: "VzhyNEhIcmdTTQ=="
  mysql-password: "R2ZtNkFHNDhpOQ=="
  mysql-replication-password: "bDBiTWVzVmVORA=="

The three passwords represent different types of access: the mysql-root-password provides administrative access to the MySQL node, while the mysql-replication-password is used for nodes to communicate for the purposes of data replication between nodes. The mysql-password is used by client applications to access the database to write and read data.
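
Note that these values are Base64 encoded rather than encrypted. If you need to retrieve one, for example to log in as the MySQL administrator, you can decode it with a command along these lines (the base64 flag may differ slightly depending on your operating system):

kubectl get secret mysql -o jsonpath='{.data.mysql-root-password}' | base64 --decode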

ConfigMaps

The Bitnami MySQL Helm chart creates Kubernetes ConfigMap resources to represent the configuration settings used for Pods that run the MySQL primary and secondary replica nodes. ConfigMaps store configuration data as key-value pairs. For example, the ConfigMap created by the Helm chart for the primary replicas looks like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-primary
  namespace: default
  labels: ...
data:
  my.cnf: |-
   
    [mysqld]
    default_authentication_plugin=mysql_native_password
    ...

In this case, the key is the name my.cnf, which represents a filename, and the value is a multiline set of configuration settings that represent the contents of a configuration file (which we’ve abbreviated here). Next, look at the definition of the StatefulSet for the primary replicas. Notice that the contents of the ConfigMap are mounted as a read-only file inside each Pod, according to the Pod specification for the StatefulSet (again, we’ve omitted some detail to focus on key areas):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-primary
  namespace: default
  labels: ...
spec:
  replicas: 1
  selector:
    matchLabels: ...
  serviceName: mysql-primary
  template:
    metadata:
      annotations: ...
      labels: ...
    spec:
      ...     
      serviceAccountName: mysql
      containers:
        - name: mysql
          image: docker.io/bitnami/mysql:8.0.26-debian-10-r60
          volumeMounts:
            - name: data
              mountPath: /bitnami/mysql
            - name: config
              mountPath: /opt/bitnami/mysql/conf/my.cnf
              subPath: my.cnf
      volumes:
        - name: config
          configMap:
            name: mysql-primary

Mounting the ConfigMap as a volume in a container results in the creation of a read-only file in the mount directory that is named according to the key and has the value as its content. For our example, mounting the ConfigMap in the Pod’s mysql container results in the creation of the file /opt/bitnami/mysql/conf/my.cnf.

This is one of several ways that ConfigMaps can be used in Kubernetes applications:

  • As described in the Kubernetes documentation, you could choose to store configuration data in more granular key-value pairs, which also makes it easier to access individual values in your application.

  • You can also reference individual key-value pairs as environment variables you pass to a container, as sketched after this list.

  • Finally, applications can access ConfigMap contents via the Kubernetes API.
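
As an illustration of the second approach, a container spec can surface a single key from a ConfigMap as an environment variable; the ConfigMap name and key below are hypothetical placeholders:

env:
  - name: MYSQL_EXTRA_FLAGS
    valueFrom:
      configMapKeyRef:
        name: mysql-extra-config
        key: extra-flags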

More Configuration Options

Now that you have a Helm release with a working MySQL cluster, you can point an application to it, such as WordPress. Why not see whether you can adapt the WordPress deployment from Chapter 3 to point to the MySQL cluster you’ve created here?

For further learning, you could also compare your resulting configuration with that produced by the Bitnami WordPress Helm chart, which uses MariaDB instead of MySQL but is otherwise quite similar.

Updating Helm Charts

If you’re running a Helm release in a production environment, chances are you’re going to need to maintain it over time. You might want to update a Helm release for various reasons:

  • A new version of a chart is available.

  • A new version of an image used by your application is available.

  • You want to change the selected options.

To check for a new version of a chart, execute the helm repo update command. Running this command with no options looks for updates in all of the chart repositories you have configured for your Helm client:

helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈Happy Helming!⎈

Next, you’ll want to make any desired updates to your configured values. If you’re upgrading to a new version of a chart, make sure to check the release notes and documentation of the configurable values. It’s a good idea to test out an upgrade before applying it. The --dry-run option allows you to do this, producing output similar to that of the helm template command:

helm upgrade mysql bitnami/mysql -f values.yaml --dry-run

Using an Overlay Configuration File

One useful option for the upgrade is to specify the values you wish to override in a new configuration file and apply both the old and the new files, something like this:

helm upgrade mysql bitnami/mysql \
  -f values.yaml -f new-values.yaml

Configuration files are applied in the order they appear on the command line, so if you use this approach, make sure the file containing your overriding values appears after your original values file.

Once you’ve applied the upgrade, Helm sets about its work, updating only those resources in the release that are affected by your configuration changes. If you’ve specified changes to the Pod template for a StatefulSet, the Pods will be restarted according to the update policy specified for the StatefulSet, as we discussed in “StatefulSet lifecycle management”.

Uninstalling Helm Charts

When you are finished using your Helm release, you can uninstall it by name:

helm uninstall mysql

Note that Helm does not remove any of the PersistentVolumeClaims or PersistentVolumes that were created for this Helm chart, following the behavior of StatefulSets discussed in Chapter 3.
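
If you’re certain you no longer need the data, you can delete the remaining claims yourself by using the labels applied to the release, with something like the following. This permanently removes the data, and it assumes the claims carry the release’s instance label, which you can confirm with kubectl get pvc --show-labels:

kubectl delete pvc -l app.kubernetes.io/instance=mysql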

Using Helm to Deploy Apache Cassandra

Now let’s switch gears and look at deploying Apache Cassandra by using Helm. In this section, you’ll use another chart provided by Bitnami, so there’s no need to add another repository. You can find the implementation of this chart on GitHub. Helm provides a quick way to see the metadata about this chart:

helm show chart bitnami/cassandra

After reviewing the metadata, you’ll also want to learn about the configurable values. You can examine the values.yaml file in the GitHub repo, or use another option on the show command:

helm show values bitnami/cassandra

The list of options for this chart is shorter than the list for the MySQL chart, because Cassandra doesn’t have the concept of primary and secondary replicas. However, you’ll certainly see similar options for images, StorageClasses, security, liveness and readiness probes, and so on. Some configuration options are unique to Cassandra, such as those having to do with JVM settings and seed nodes (as discussed in Chapter 3).

One interesting feature of this chart is the ability to export metrics from Cassandra nodes. If you set metrics.enabled=true, the chart will inject a sidecar container into each Cassandra Pod that exposes a port that can be scraped by Prometheus. Other values under metrics configure what metrics are exported, the collection frequency, and more. While we won’t use this feature here, metrics reporting is a key part of managing data infrastructure we’ll cover in Chapter 6.
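
If you did want to try it, the override is a single value, which you could capture in a small values file like the following (check helm show values bitnami/cassandra for the related settings under metrics):

metrics:
  enabled: true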

For a simple three-node Cassandra configuration, you could set the replica count to 3 and set other configuration values to their defaults. However, since you’re overriding only a single configuration value, this is a good time to take advantage of Helm’s support for setting values on the command line, instead of providing a values.yaml file:

helm install cassandra bitnami/cassandra --set replicaCount=3

As discussed previously, you can use the helm template command to check the configuration before installing it, or look at the file we’ve saved on GitHub. However, since you’ve already created the release, you can also use this command:

helm get manifest cassandra

Looking through the resources in the YAML, you’ll see that a similar set of infrastructure has been established, as shown in Figure 4-4.

The configuration includes the following:

  • A ServiceAccount referencing a Secret, which contains the password for the cassandra administrator account.

  • A single StatefulSet, with a headless Service used to reference its Pods. The Pods are spread evenly across the available Kubernetes Worker Nodes, which we’ll discuss in the next section. The Service exposes the Cassandra ports used for internode communication (7000, with 7001 used for secure communication via TLS), administration via JMX (7199), and client access via CQL (9042).

Figure 4-4. Deploying Apache Cassandra using the Bitnami Helm chart

This configuration represents a simple Cassandra topology, with all three nodes in a single Datacenter and rack. This simple topology reflects one of the limitations of this chart—it does not provide the ability to create a Cassandra cluster consisting of multiple Datacenters and racks. To create a more complex deployment, you’d have to install multiple Helm releases, using the same clusterName (in this case, you’re using the default name cassandra), but a different Datacenter and rack per deployment. You’d also need to obtain the IP address of a couple of nodes in the first Datacenter to use as additionalSeeds when configuring the releases for the other racks.
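
A sketch of the overriding values for one of the additional releases might look like the following. The names clusterName and additionalSeeds are the options mentioned above, but you should confirm the exact keys, along with how the Datacenter and rack are specified, against helm show values bitnami/cassandra for your chart version; the seed address shown is a placeholder:

clusterName: cassandra
additionalSeeds:
  - 10.244.1.5   # replace with the address of a seed node in the first Datacenter
# plus the Datacenter and rack settings for this release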

Affinity and Anti-Affinity

As shown in Figure 4-4, the Cassandra nodes are spread evenly across the Worker Nodes in your cluster. To verify this in your own Cassandra release, you could run something like the following:

kubectl describe pods | grep "^Name:" -A 3
Name:         cassandra-0
Namespace:    default
Priority:     0
Node:         kind-worker/172.20.0.7
--
Name:         cassandra-1
Namespace:    default
Priority:     0
Node:         kind-worker2/172.20.0.6
--
Name:           cassandra-2
Namespace:      default
Priority:       0
Node:           kind-worker3/172.20.0.5

As you can see, each Cassandra node is running on a different Worker Node. If your Kubernetes cluster has at least three Worker Nodes and no other workloads, you’ll likely observe similar behavior. While this even allocation could happen naturally in a cluster that has an even load across Worker Nodes, that’s probably not the case in your production environment. To maximize the availability of your data, you’ll want to honor the intent of Cassandra’s architecture by running its nodes on different machines.

To encourage this isolation, the Bitnami Helm chart uses Kubernetes’s affinity capabilities, specifically anti-affinity. If you examine the generated configuration for the Cassandra StatefulSet, you’ll see the following:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  namespace: default
  labels: ...
spec:
  ...
  template:
    metadata:
      labels: ...
    spec:
      ...
      affinity:
        podAffinity:
         
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: cassandra
                    app.kubernetes.io/instance: cassandra
                namespaces:
                  - "default"
                topologyKey: kubernetes.io/hostname
              weight: 1
        nodeAffinity:

As shown here, the Pod template specification lists three possible types of affinity, with only the podAntiAffinity being defined. What do these concepts mean?

Pod affinity
The preference that a Pod is scheduled onto a node where another specific Pod is running. For example, Pod affinity could be used to colocate a web server with its cache.
Pod anti-affinity
The opposite of Pod affinity—that is, a preference that a Pod not be scheduled on a node where another identified Pod is running. This is the constraint used in this example, as we’ll discuss shortly.
Node affinity
A preference that a Pod be run on a node with specific characteristics.

Each type of affinity can be expressed as either hard or soft constraints. These are known as requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution. The first constraint specifies rules that must be met before a Pod is scheduled on a node, while the second specifies a preference that the scheduler will attempt to meet but may relax if necessary in order to schedule the Pod.

IgnoredDuringExecution implies that the constraints apply only when the Pods are first scheduled. In the future, RequiredDuringExecution variants of these options (such as requiredDuringSchedulingRequiredDuringExecution) are planned. These will ask Kubernetes to evict Pods (that is, move them to another node) that no longer meet the criteria, for example, because their labels have changed.

Looking at the preceding example, the Pod template specification for the Cassandra StatefulSet specifies an anti-affinity rule using the labels that are applied to each Cassandra Pod. The net effect is that Kubernetes will try to spread the Pods across the available Worker Nodes.
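
For contrast, if you wanted the scheduler to refuse outright to place two Cassandra Pods on the same Worker Node, rather than merely prefer not to, the hard form of the same rule would look something like this sketch (note that the required form lists the affinity terms directly, without a weight):

        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: cassandra
                  app.kubernetes.io/instance: cassandra
              topologyKey: kubernetes.io/hostname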

Those are the highlights of looking at the Bitnami Helm chart for Cassandra. To clean things up, uninstall the Cassandra release:

helm uninstall cassandra

If you don’t want to work with Bitnami Helm charts any longer, you can also remove the repository from your Helm client:

helm repo remove bitnami

More Kubernetes Scheduling Constraints

Kubernetes supports additional mechanisms for providing hints to its scheduler about Pod placement. One of the simplest is the nodeSelector field, which is very similar to node affinity but has a less expressive syntax: it matches one or more node labels combined with AND logic. Since you may or may not have the required privileges to attach labels to Worker Nodes in your cluster, Pod affinity is often a better option. Taints and tolerations are another mechanism that can be used to configure Worker Nodes to repel specific Pods from being scheduled on those nodes.
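
As a rough sketch of what these look like in a Pod specification (the label and taint keys here are hypothetical):

spec:
  # nodeSelector: schedule this Pod only onto nodes carrying the matching label
  nodeSelector:
    disktype: ssd
  # tolerations: allow this Pod onto nodes tainted with dedicated=database:NoSchedule
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "database"
      effect: "NoSchedule"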

In general, you want to be careful to understand all of the constraints you’re putting on the Kubernetes scheduler from various workloads so as not to overly constrain its ability to place Pods. See the Kubernetes documentation for more information on scheduling constraints. We’ll also look at how Kubernetes allows you to plug in different schedulers in “Alternative Schedulers for Kubernetes”.

Helm, CI/CD, and Operations

Helm is a powerful tool focused on one primary task: deploying complex applications to Kubernetes clusters. To get the most benefit from Helm, you’ll want to consider how it fits into your larger CI/CD toolset:

  • Automation servers such as Jenkins automatically build, test, and deploy software according to scripts known as jobs. These jobs are typically run based on predefined triggers, such as a commit to a source repository. Helm charts can be referenced in jobs to install an application under test and its supporting infrastructure in a Kubernetes cluster.

  • IaC automation tools such as Terraform allow you to define templates and scripts that describe how to create infrastructure in a variety of cloud environments. For example, you could write a Terraform script that automates the creation of a new VPC within a specific cloud provider and the creation of a new Kubernetes cluster within that VPC. The script could then use Helm to install applications within the Kubernetes cluster.

While overlaps certainly occur in the capabilities these tools provide, you’ll want to consider the strengths and limitations of each as you construct your toolset. For this reason, we want to make sure to note that Helm has limitations when it comes to managing the operations of applications that it deploys. To get a good picture of the challenges involved, we spoke to a practitioner who has built assemblies of Helm charts to manage a complex database deployment. This discussion begins to introduce concepts like Kubernetes Custom Resource Definitions (CRDs) and the operator pattern, both of which we’ll cover in depth in Chapter 5.

As John Sanda notes in his commentary, Helm is a powerful tool for scripting the deployment of applications consisting of multiple Kubernetes resources, but can be less effective at managing more complex operational tasks. As you’ll see in the chapters to come, a common pattern used for data infrastructure and other complex applications is to use a Helm chart to deploy an operator, which can then in turn manage both the deployment and lifecycle of the application.

Summary

In this chapter, you’ve learned how a package management tool like Helm can help you manage the deployment of applications on Kubernetes, including your database infrastructure. Along the way, you’ve also learned how to use some additional Kubernetes resources like ServiceAccounts, Secrets, and ConfigMaps. Now it’s time to round out our discussion of running databases on Kubernetes. In the next chapter, we’ll take a deeper dive into managing database operations on Kubernetes by using the operator pattern.
