One of the main attractions of using an IaaS platform is the prospect of fast, fully automated, and repeatable cluster provisioning, entirely driven by configuration, an approach often referred to as infrastructure as code. In this chapter, we look at what we need to do to automate the process of deploying clusters, both long-lived and transient, and at some of the special considerations to take into account when operating in the cloud, such as integrating with security and shared metadata services, and growing and shrinking clusters.
For the purposes of this discussion, long-lived cluster deployments are those with a life cycle that is governed outside of a specific workload or transient use case (see also “Cluster Life Cycle Models”). They may be used for multiple workloads, tenants, and use cases. Although their tenure can vary, typically such clusters have lifetimes measured in weeks to months rather than days. Standing up a cluster like this normally entails a few phases, as shown in Figure 17-1.
Before examining each of the phases, we first cover the configuration and templating that should drive any automation solution.
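To make the idea of configuration-driven provisioning concrete, the following is a minimal sketch in Python of rendering a cluster definition from a template and a handful of parameters. The field names (`cluster_name`, `lifecycle`, `worker_count`, `instance_type`) and the `render_cluster_config` helper are hypothetical and chosen only for illustration; they do not correspond to any particular cloud provider or provisioning tool.

```python
import json
from string import Template

# Hypothetical template: the field names are illustrative only and are
# not taken from any real provisioning tool or cloud provider API.
CLUSTER_TEMPLATE = Template("""{
  "cluster_name": "$name",
  "lifecycle": "$lifecycle",
  "worker_count": $workers,
  "instance_type": "$instance_type"
}""")


def render_cluster_config(name, lifecycle, workers, instance_type):
    """Fill in the template and parse the result so malformed output fails early."""
    if lifecycle not in ("long-lived", "transient"):
        raise ValueError(f"unknown lifecycle: {lifecycle}")
    rendered = CLUSTER_TEMPLATE.substitute(
        name=name,
        lifecycle=lifecycle,
        workers=workers,
        instance_type=instance_type,
    )
    # Parsing the rendered text validates that the template produced valid JSON.
    return json.loads(rendered)


config = render_cluster_config("analytics-dev", "transient", 5, "example.large")
print(json.dumps(config, indent=2, sort_keys=True))
```

Because the same template can be rendered with different parameters, a long-lived and a transient cluster can share one definition and differ only in the values supplied, which is what makes the provisioning repeatable.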
Although this chapter focuses on cloud deployments, the general automation principles can—and should—also be used for on-premises deployments. ...