This chapter discusses the provisioning and configuration of Hadoop cluster nodes. If you are using a cloud environment, then Part III is the more suitable section to read, as far as provisioning is concerned. In any event, the vast majority of Hadoop nodes run on Linux, so the operating system (OS)–related topics in this chapter still apply.
The first task after acquiring physical hardware in the form of rack-mountable servers (for example, a 19” rack server or blades) is to provision the OS. There are many options, some dating back decades, which allow you to automate that process considerably. Separate technologies are often used for each step of the process:
The initial phase of a machine provisioning process is to automatically assign it an IP address and install the OS bootstrap executable. The most common technology used for this is called Preboot Execution Environment (PXE), which was introduced as part of the larger open industry standard Wired for Management (WfM). The latter also included the familiar Wake-on-LAN (WoL) standard. WfM was replaced by the Intelligent Platform Management Interface (IMPI) in 1998.