Appendix D. Distributions
The Apache releases are not the only way to install HBase. This appendix lists the available alternatives.
Cloudera’s Distribution including Apache Hadoop
Cloudera’s Distribution including Apache Hadoop (hereafter CDH) is based on the most recent stable version of Apache Hadoop with numerous patches, backports, and updates. Cloudera makes the distribution available in a number of different formats: source and binary tar files, RPMs, Debian packages, VMware images, and scripts for running CDH in the cloud. CDH is free, released under the Apache 2.0 license, and available at http://www.cloudera.com/hadoop/.
To simplify deployment, Cloudera hosts packages on public yum and apt repositories. CDH enables you to install and configure Hadoop and HBase on each machine using a single command. Kickstart users can commission entire Hadoop clusters without manual intervention.
CDH manages cross-component versions and provides a stable platform with a compatible set of packages that work together. As of CDH3, the following packages are included, many of which are covered elsewhere in this book:
- HDFS
Self-healing distributed filesystem
- MapReduce
Powerful, parallel data processing framework
- Hadoop Common
A set of utilities that support the Hadoop subprojects
- HBase
Hadoop database for random read/write access
- Hive
SQL-like queries and tables on large data sets
- Pig
Dataflow language and compiler
- Oozie
Workflow for interdependent Hadoop jobs
- Sqoop
Integrates databases and data warehouses ...