To speed up the creation of Hadoop clusters in the cloud, you can create an image for one or more representative cluster instances with software already installed and, for the most part, configured. Chapter 16 describes the process. Still, there are configuration steps that can only be done once the cluster instances are running, and the scripts here can automate the work.
A quick glance at the scripts should reveal that even this small slice of automation is not trivial. You might consider borrowing the techniques used here and implementing them in a different scripting language or framework. Central to these techniques is the ability to establish SSH connections into and within the cluster, so look for frameworks that help in that regard, such as Fabric, a Python library for running commands over SSH.
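As a rough illustration of what connecting "into and within" a cluster can look like, the following sketch runs a command on a worker instance by hopping through the manager instance. It is not taken from the book's scripts: the function name `run_on_worker`, the `SSH_CMD` override, and the host names in the usage note are hypothetical, and the sketch assumes an OpenSSH client recent enough to support the `-J` (ProxyJump) option.

```shell
#!/usr/bin/env bash
# Minimal sketch, not the book's script: run a command on a worker instance
# by jumping through the manager instance.
# run_on_worker and SSH_CMD are hypothetical names used for illustration.

run_on_worker() {
  local manager="$1" worker="$2"
  shift 2
  # BatchMode=yes makes ssh fail instead of prompting for a password.
  # -J routes the connection through the manager (OpenSSH 7.3 or later).
  # SSH_CMD can be overridden, for example with "echo", to preview the call.
  ${SSH_CMD:-ssh} -o BatchMode=yes -J "$manager" "$worker" "$@"
}
```

A call such as `run_on_worker ec2-user@manager.example.com ec2-user@10.0.0.12 hostname` would then print the worker's host name, provided your local key is accepted by both instances. Setting `SSH_CMD=echo` first prints the arguments that would be passed to ssh without connecting anywhere.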
Code is available at this book’s code repository.
The Hadoop installation process in Chapter 9 and elsewhere includes the creation of user accounts specific to services like HDFS, YARN, and ZooKeeper. Those accounts can already be in place in an image, but for greater security they should be configured with unique SSH keys, so that instances in one cluster cannot access instances in other clusters that were built from the same image.
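To make the idea of per-account keys concrete, here is a minimal sketch of generating one fresh key pair for each service account. It is an illustration under stated assumptions rather than the approach the book's script takes: the function name `generate_account_keys`, the staging directory, the key file naming, and the choice of the Ed25519 key type are all hypothetical.

```shell
#!/usr/bin/env bash
# Minimal sketch, not the book's script: create one SSH key pair per
# Hadoop service account in a staging directory.
# The function name, directory layout, and key type are assumptions.

generate_account_keys() {
  local key_dir="$1"
  shift
  mkdir -p "$key_dir"
  chmod 700 "$key_dir"
  local account key_file
  for account in "$@"; do
    key_file="${key_dir}/${account}_id_ed25519"
    # Skip accounts that already have a key so that reruns are harmless.
    if [ -f "$key_file" ]; then
      continue
    fi
    # -N "" sets an empty passphrase, -C labels the key with the account
    # name, and -q suppresses the progress output.
    ssh-keygen -q -t ed25519 -N "" -C "$account" -f "$key_file"
  done
}
```

Running `generate_account_keys /tmp/cluster-keys hdfs yarn zookeeper` on a freshly started instance would leave a private key and a matching `.pub` file for each account. Because the keys are generated after the instance starts rather than baked into the image, two clusters built from the same image end up with different keys.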
The Bash script in Example B-1 can be run from your local computer to orchestrate the creation of SSH key pairs on the manager instance of a Hadoop cluster for each Hadoop account, and the distribution of public keys ...