Monitoring Hadoop

Book Description

Get to grips with the intricacies of Hadoop monitoring using the power of Ganglia and Nagios

In Detail

As data volumes grow exponentially and more enterprises process ever-larger datasets, Hadoop has gained considerable popularity as a data platform. Like any platform, Hadoop needs to be monitored to verify that it is working and performing as expected, and there is an ever-increasing need to keep Hadoop clusters clean and healthy.

This book will help you integrate Hadoop and Nagios seamlessly. It begins with the basics of operating system logging and monitoring. Getting to grips with the characteristics of Hadoop monitoring, metrics, and log collection will help Hadoop users, especially administrators, diagnose and troubleshoot clusters more effectively. In essence, the book teaches you how to set up a comprehensive and robust monitoring system for the Hadoop platform. It also serves as a quick reference to the various metrics available in Hadoop.

The book concludes with the visualization of Hadoop metrics. With the help of step-by-step instructions in each chapter, you will get acquainted with the workings of Hadoop in a short span of time.

What You Will Learn

  • Install Nagios and Ganglia and understand logging at the operating system level
  • Create and configure Nagios nodes for monitoring with custom checks
  • Monitor Hadoop daemons such as NameNode, DataNode, JobTracker, and so on
  • Configure logs for various daemons and set up audits for the operations performed on the cluster
  • Track important parameters for the File System, MapReduce, and other counters
  • Set up Nagios master and client nodes with checks for the system and the applications running on them
  • Configure the Hadoop metrics collection and visualize it for nontechnical users
  • Understand the communication between different daemons and protocols and the ports they use

Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

  1. Monitoring Hadoop
    1. Table of Contents
    2. Monitoring Hadoop
    3. Credits
    4. About the Author
    5. About the Reviewers
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers, and more
        1. Why subscribe?
        2. Free access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Errata
        2. Piracy
        3. Questions
    8. 1. Introduction to Monitoring
      1. The need for monitoring
      2. The monitoring tools available in the market
        1. Nagios
          1. Nagios architecture
          2. Prerequisites for installing and configuring Nagios
            1. Prerequisites
          3. Installing Nagios
          4. Web interface configuration
          5. Nagios plugins
          6. Verification
          7. Configuration files
          8. Setting up monitoring for clients
        2. Ganglia
          1. Ganglia components
          2. Ganglia installation
      3. System logging
        1. Collection
        2. Transportation
        3. Storage
        4. Alerting and analysis
        5. The syslogd and rsyslogd daemons
      4. Summary
    9. 2. Hadoop Daemons and Services
      1. Hadoop daemons
        1. NameNode
        2. DataNode and TaskTracker
        3. Secondary NameNode
        4. JobTracker and YARN daemons
        5. The communication between daemons
      2. YARN framework
        1. Common issues faced on Hadoop cluster
        2. Host-level checks
        3. Nagios server
        4. Configuring Hadoop nodes for monitoring
      3. Summary
    10. 3. Hadoop Logging
      1. The need for logging events
      2. System logging
      3. Logging levels
      4. Logging in Hadoop
        1. Hadoop logs
        2. Hadoop log level
        3. Hadoop audit
      5. Summary
    11. 4. HDFS Checks
      1. HDFS overview
      2. Nagios master configuration
      3. The Nagios client configuration
      4. Summary
    12. 5. MapReduce Checks
      1. MapReduce overview
      2. MapReduce control commands
      3. MapReduce health checks
      4. Nagios master configuration
      5. Nagios client configuration
      6. Summary
    13. 6. Hadoop Metrics and Visualization Using Ganglia
      1. Hadoop metrics
      2. Metrics contexts
        1. Named contexts
      3. Metrics system design
      4. Metrics configuration
      5. Configuring Metrics2
      6. Exploring the metrics contexts
      7. Hadoop Ganglia integration
        1. Hadoop metrics configuration for Ganglia
        2. Setting up Ganglia nodes
      8. Hadoop configuration
        1. Metrics1
        2. Metrics2
      9. Ganglia graphs
      10. Metrics APIs
        1. The org.apache.hadoop.metrics package
        2. The org.apache.hadoop.metrics2 package
      11. Summary
    14. 7. Hive, HBase, and Monitoring Best Practices
      1. Hive monitoring
      2. Hive metrics
        1. HBase monitoring
      3. HBase Nagios monitoring
      4. HBase metrics
      5. Monitoring best practices
      6. The Filter class
      7. Nagios and Ganglia best practices
      8. Summary
    15. Index