O'Reilly logo

Hadoop 2.x Administration Cookbook by Gurmukh Singh

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Configure HDFS cache

In Hadoop, centralized cache management is an explicit mechanism for caching the most frequently used files. Users can configure the path to be cached by HDFS, which prevents them from being evicted from memory. Namenode is responsible for coordinating all the Datanode caches in the cluster and periodically receives a cache report.

Getting ready

For this recipe, you will again need a running cluster with at least the HDFS daemons running in the cluster.

How to do it...

  1. Connect to the master1.cyrus.com master node and switch to user hadoop.
  2. The first step is to define a cache pool, which is a collection of cache directives. Refer to the following command and screenshot:
    $ hdfs cacheadmin -addPool sales
    
  3. Then, we need to define a ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required