Appendix B. Cloudera’s Distribution for Hadoop
Cloudera’s Distribution for Hadoop is based on the most recent stable version of Apache Hadoop with numerous patches, backports, and updates. Cloudera shares this distribution in a number of different formats: compressed tar files, RPMs, Debian packages, and Amazon EC2 AMIs. Cloudera’s Distribution for Hadoop is free, released under the Apache 2.0 license, and available at http://www.cloudera.com/hadoop/.
Cloudera has an online configurator at http://www.cloudera.com/configurator to make setting up a Hadoop cluster easy (Figure B-1). The configurator has a simple wizard-like interface that asks targeted questions about your cluster. When you’ve finished, the configurator generates customized Hadoop packages and places them in a package repository for you. You can manage any number of clusters and return at a later time to update your active configurations.
To simplify package management, Cloudera shares RPMs from a yum repository and Debian packages from an apt repository. Cloudera’s Distribution for Hadoop enables you to install and configure Hadoop on each machine in your cluster by running a single, simple command. Kickstart users benefit even more by being able to commission entire Hadoop clusters automatically ...