Book description
A complete, hands-on guide to building and maintaining large Apache Hadoop clusters using Cloudera Manager and CDH5
In Detail
Apache Hadoop is an open source distributed computing technology that helps users process large volumes of data with relative ease and draw insights from that data. Cloudera, with its open source distribution of Hadoop, has made big data analytics accessible to anyone interested.
This book prepares you to be a Hadoop administrator, with special emphasis on Cloudera's CDH. It provides step-by-step instructions for setting up and managing a robust Hadoop cluster running CDH5. It also covers tools such as Cloudera Manager, which many companies use to manage Hadoop clusters with hundreds of nodes. You will learn how to set up security using Kerberos, and how to use Cloudera Manager to set up alerts and events that help you monitor and troubleshoot cluster issues.
What You Will Learn
- Understand the Apache Hadoop architecture and the future of distributed processing frameworks
- Use HDFS and MapReduce for all file-related operations
- Install and configure CDH to bring up an Apache Hadoop cluster
- Configure HDFS High Availability and HDFS Federation to prevent single points of failure
- Install and configure Cloudera Manager to perform administrator operations
- Implement security by installing and configuring Kerberos for all services in the cluster
- Add, remove, and rebalance nodes in a cluster using cluster management tools
- Understand and configure the different backup options for your HDFS data
Table of contents
- Cloudera Administration Handbook
- Table of Contents
- Cloudera Administration Handbook
- Credits
- Notice
- About the Author
- About the Reviewers
- www.PacktPub.com
- Preface
- 1. Getting Started with Apache Hadoop
- 2. HDFS and MapReduce
- 3. Cloudera's Distribution Including Apache Hadoop
  - Getting started with CDH
  - Understanding the CDH components
  - Installing CDH
  - Installing the CDH components
  - Summary
- 4. Exploring HDFS Federation and Its High Availability
- 5. Using Cloudera Manager
- 6. Implementing Security Using Kerberos
- 7. Managing an Apache Hadoop Cluster
- 8. Cluster Monitoring Using Events and Alerts
- 9. Configuring Backups
- Index
Product information
- Title: Cloudera Administration Handbook
- Author(s):
- Release date: July 2014
- Publisher(s): Packt Publishing
- ISBN: 9781783558964