Chapter 17. Dynamic Clustering in the Cloud
This chapter continues where Chapter 16 left off and explains how clusters don't have to consist of a fixed number of slaves but rather can be dynamic in nature. Once you've learned all about dynamic clustering, we introduce you to a dynamic set of resources called cloud computing. We then move on to a practical implementation of one cloud computing service: the Amazon Elastic Compute Cloud (EC2). We finish off the chapter by explaining how you can configure your own set of servers on Amazon EC2 for use as a cluster.
At most organizations, it is still standard practice for ETL developers to have only one or two servers to work with, but it is becoming more common to have a whole set of machines available as general compute resources. This section describes how Kettle clustering lets you take advantage of a dynamic pool of compute resources.
Even before terms such as cloud computing and virtual machines became popular, initiatives like SETI@Home were already utilizing computer resources dynamically. SETI@Home was one of the very first popular distributed dynamic clusters: people all over the world contributed processing power to help the Search for Extraterrestrial Intelligence. The SETI@Home cluster is dynamic in configuration because the number of participating nodes is constantly changing. In fact, SETI@Home is implemented as a screensaver, so it's impossible to say up front how many machines participate ...