Hadoop Operations and Cluster Management Cookbook by Shumin Guo

Chapter 7. Tuning a Hadoop Cluster for Best Performance

In this chapter, we will cover:

  • Benchmarking and profiling a Hadoop cluster
  • Analyzing job history with Rumen
  • Benchmarking a Hadoop cluster with GridMix
  • Using Hadoop Vaidya to identify performance problems
  • Balancing data blocks for a Hadoop cluster
  • Choosing a proper block size
  • Using compression for input and output
  • Configuring speculative execution
  • Setting proper number of map and reduce slots for TaskTracker
  • Tuning the JobTracker configuration
  • Tuning the TaskTracker configuration
  • Tuning shuffle, merge, and sort parameters
  • Configuring memory for a Hadoop cluster
  • Setting proper number of parallel copies
  • Tuning JVM parameters
  • Configuring JVM Reuse
  • Configuring the reducer initialization time

Introduction

Hadoop ...
