Resourcemanager HA using ZooKeeper

In this recipe, we will cover Resourcemanager (RM) high availability. In a Hadoop cluster, if the RM goes offline for any reason, all the jobs on the cluster will fail. In production, critical jobs may run for a long time, and restarting them from scratch because of an RM failure is costly. Resourcemanager HA was introduced in Hadoop 2.4 and supports both manual and automatic failover.

Like the Namenode HA discussed in the earlier recipes, Resourcemanager HA has only one active node at any given time, while the other RM waits in standby. Failover is initiated either manually by an admin command or automatically by using ZooKeeper.
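To make the active/standby arrangement concrete, the following is a minimal sketch of the yarn-site.xml properties typically involved in ZooKeeper-based automatic failover. It is not the recipe's own configuration: the cluster ID, the RM IDs (rm1, rm2), and all hostnames and ports are hypothetical placeholders that must be replaced with values from your cluster.

```xml
<!-- Sketch only: a yarn-site.xml fragment for RM HA with two Resourcemanagers. -->
<!-- All IDs, hostnames, and ports below are assumed placeholder values. -->
<configuration>
  <!-- Turn on Resourcemanager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Logical name for this pair of RMs (hypothetical value) -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <!-- Logical IDs of the participating RMs (hypothetical values) -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Host for each RM ID (hypothetical hostnames) -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>rm1.example.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>rm2.example.com</value>
  </property>
  <!-- ZooKeeper quorum used for leader election (hypothetical hosts) -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
  </property>
  <!-- Let ZooKeeper-based election pick the active RM automatically -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
```

With a configuration along these lines, the state of each RM can be checked with `yarn rmadmin -getServiceState rm1`, where rm1 is one of the assumed IDs above. For manual failover, an admin would instead use `yarn rmadmin -transitionToStandby` and `yarn rmadmin -transitionToActive` with the relevant RM ID.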

Resourcemanager HA can be configured by either using ...
