Chapter 16. Using Images for Cluster Management

It is worthwhile to go through the exercise of building out a Hadoop cluster “by hand,” as laid out in Part III, in order to understand how instance allocation, networking, and security rules all play their parts. Once you are building clusters over and over again, though, you’ll want to be much more efficient.

You don’t usually need to establish virtual networks, routing, and security rules for every new cluster. Multiple clusters can happily coexist within one or a few networks and abide by the same rules. Each cluster needs its own instances, but starting those up is also straightforward, either through a provider console or with a single API call. It’s Hadoop installation and configuration that takes most of the time.

One method to speed things up is to create images for cluster instances with most of the installation and configuration work baked in. Instances based on those images can already have Hadoop components almost ready to go, requiring only a minimum of additional configuration to become fully functional.
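As a rough illustration of what launching from such an image can look like, here is a minimal sketch that assumes AWS EC2 and the boto3 library. The image ID, instance type, cluster name, and the /etc/cluster.env path are all hypothetical placeholders, and the helper names are made up for this example; other providers offer comparable calls.

```python
def build_launch_request(image_id, instance_type, count, cluster_name):
    """Build keyword arguments for an EC2 run_instances call.

    Assumes the image already has Hadoop installed, so the only work left
    at boot time is a small piece of per-cluster configuration.
    """
    # Hypothetical boot script: records the cluster name for later setup steps.
    user_data = (
        "#!/bin/bash\n"
        "echo CLUSTER_NAME={} >> /etc/cluster.env\n".format(cluster_name)
    )
    return {
        "ImageId": image_id,            # the prebuilt image to stamp out
        "InstanceType": instance_type,
        "MinCount": count,              # request exactly `count` instances
        "MaxCount": count,
        "UserData": user_data,
    }


def launch_instances(request):
    """Send the request to EC2. Needs boto3 and configured AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(**request)
    return [instance["InstanceId"] for instance in response["Instances"]]


# Placeholder values, for illustration only.
request = build_launch_request("ami-0123456789abcdef0", "m5.xlarge", 3, "demo")
```

Passing `request` to `launch_instances` would start three identical instances from the same image. Keeping the request builder separate from the API call makes it easy to inspect what will be launched before anything is actually started.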

Stamping out cluster instances like this is not only faster but also results in fewer mistakes and unintentional variations. It also lends itself to automation; you could imagine it as one step in a fully automated process for building Hadoop clusters.

Images can be shared. That means that a valuable base image for creating Hadoop cluster instances can benefit not just the person who built it, but their friends, coworkers, and even ...
