Architecting Modern Data Platforms by Lars George, Paul Wilkinson, Ian Buss, Jan Kunigk

Chapter 7. Provisioning Clusters

This chapter discusses the provisioning and configuration of Hadoop cluster nodes. If you are using a cloud environment, then Part III is the more suitable section to read, as far as provisioning is concerned. In any event, the vast majority of Hadoop nodes run on Linux, so the operating system (OS)–related topics in this chapter still apply.

Operating Systems

The first task after acquiring physical hardware in the form of rack-mountable servers (for example, a 19” rack server or blades) is to provision the OS. There are many options, some dating back decades, that let you automate most of that process. Separate technologies are often used for each step of the process:

Server bootstrap

The initial phase of a machine provisioning process is to automatically assign the machine an IP address and install the OS bootstrap executable. The most common technology used for this is the Preboot Execution Environment (PXE), which was introduced as part of the larger open industry standard Wired for Management (WfM). The latter also included the familiar Wake-on-LAN (WoL) standard. WfM was replaced by the Intelligent Platform Management Interface (IPMI) in 1998.
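
To make the bootstrap step more concrete, here is a minimal Python sketch of how a provisioning script might prepare a per-host boot configuration. It assumes PXELINUX (part of the Syslinux project) is the bootloader served over PXE, which the text above does not specify. PXELINUX looks for a config file named after the client's MAC address, prefixed with the hardware type `01` for Ethernet. The function names, kernel and initrd filenames, and the extra boot argument are hypothetical and chosen only for illustration.

```python
import re


def pxelinux_config_name(mac: str) -> str:
    """Return the per-host config filename PXELINUX looks up for a MAC address.

    Assumes PXELINUX as the bootloader; the name is the hardware type
    "01" (Ethernet) followed by the lowercase MAC, separated by dashes.
    """
    octets = re.split(r"[:-]", mac.strip().lower())
    if len(octets) != 6 or not all(re.fullmatch(r"[0-9a-f]{2}", o) for o in octets):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return "01-" + "-".join(octets)


def render_pxelinux_config(kernel: str, initrd: str, extra_args: str = "") -> str:
    """Render a minimal PXELINUX config that boots a single installer entry.

    The kernel, initrd, and extra_args values are placeholders supplied by
    the caller; they are not taken from the book.
    """
    lines = [
        "DEFAULT install",
        "LABEL install",
        f"  KERNEL {kernel}",
        f"  APPEND initrd={initrd} {extra_args}".rstrip(),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical node MAC and installer files, for illustration only.
    print(pxelinux_config_name("AA:BB:CC:DD:EE:0F"))
    print(render_pxelinux_config("vmlinuz", "initrd.img", "ip=dhcp"))
```

In such a setup, the rendered text would typically be written to a file with the computed name under the TFTP server's `pxelinux.cfg/` directory, so that each node picks up its own installer settings on first boot.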
