In this recipe, we will configure logging for HDFS and YARN, which is essential for troubleshooting and diagnosing job failures.
On larger clusters, logs must be managed with disk space usage, ease of retrieval, and performance in mind. It is recommended to store logs on separate hard disks, preferably RAID-backed, for performance. The disks that the Namenode or Datanodes use for metadata or HDFS blocks must not be shared with logs.
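As a minimal sketch of this separation, assume a dedicated log volume mounted at the hypothetical path `/var/log/hadoop`. In Hadoop 2.x, the daemon log locations are typically controlled by the `HADOOP_LOG_DIR` and `YARN_LOG_DIR` environment variables, set in `hadoop-env.sh` and `yarn-env.sh` respectively. The directory names below are illustrative assumptions, not values taken from this recipe.

```shell
# Sketch only: /var/log/hadoop is an assumed mount point for a dedicated log disk,
# kept separate from the disks holding Namenode metadata and Datanode blocks.

# Would normally go in hadoop-env.sh: location of HDFS daemon logs
export HADOOP_LOG_DIR=/var/log/hadoop/hdfs

# Would normally go in yarn-env.sh: location of YARN daemon logs
export YARN_LOG_DIR=/var/log/hadoop/yarn
```

The daemons read these variables at startup, so a change takes effect only after the corresponding services are restarted.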
To complete the recipe, the user must have a running cluster with HDFS and YARN configured, and should have worked through Chapter 1, Hadoop Architecture and Deployment, and Chapter 2, Maintain Hadoop Cluster HDFS, for the necessary background.