Architecting Modern Data Platforms by Lars George, Paul Wilkinson, Ian Buss, Jan Kunigk

Chapter 17. Automated Provisioning

One of the main attractions of using an IaaS platform is the prospect of fast, fully automated and repeatable cluster provisioning, entirely driven by configuration—an approach often referred to as infrastructure as code. In this chapter, we look at what we need to do to automate the process of deploying clusters, both long-lived and transient, and at some of the special considerations to take into account when operating in the cloud, such as integrating with security, shared metadata services, and growing and shrinking clusters.

Long-Lived Clusters

For the purposes of this discussion, long-lived cluster deployments are those with a life cycle that is governed outside of a specific workload or transient use case (see also “Cluster Life Cycle Models”). They may be used for multiple workloads, tenants, and use cases. Although their tenure can vary, typically such clusters have lifetimes measured in weeks to months rather than days. Standing up a cluster like this normally entails a few phases, as shown in Figure 17-1.

Figure 17-1. The five provisioning phases for long-lived clusters

Before examining each of the phases, we first cover the configuration and templating that should drive any automation solution.
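
As a rough illustration of configuration and templating driving automation, the sketch below expands a small declarative cluster description into one rendered line per instance. It is not taken from the book: the template fields, the `render_cluster` helper, and the example role and instance-type names are all hypothetical, and a real deployment would hand the rendered output to a provisioning tool rather than print it.

```python
from string import Template

# Hypothetical per-instance template; the field names are illustrative only.
INSTANCE_TEMPLATE = Template(
    "name=${cluster}-${role}-${index} type=${instance_type} role=${role}"
)


def render_cluster(config):
    """Expand a declarative cluster config into one rendered line per instance."""
    lines = []
    for group in config["groups"]:
        for index in range(group["count"]):
            lines.append(
                INSTANCE_TEMPLATE.substitute(
                    cluster=config["cluster"],
                    role=group["role"],
                    index=index,
                    instance_type=group["instance_type"],
                )
            )
    return lines


# Assumed example configuration: one master group and one worker group.
EXAMPLE_CONFIG = {
    "cluster": "analytics",
    "groups": [
        {"role": "master", "count": 1, "instance_type": "large"},
        {"role": "worker", "count": 3, "instance_type": "medium"},
    ],
}

if __name__ == "__main__":
    for line in render_cluster(EXAMPLE_CONFIG):
        print(line)
```

Because the output depends only on the configuration, rendering the same input twice yields the same result, which is the repeatability property that makes this style of provisioning attractive.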

Note

Although this chapter focuses on cloud deployments, the general automation principles can—and should—also be used for on-premises deployments. ...