Resource management in Hadoop
As a Hadoop administrator, one of your key responsibilities is to ensure that the cluster's resources are used as efficiently as possible. By resources, I mean CPU time, the memory allocated to jobs, network bandwidth, and storage space. Administrators achieve this by balancing the workloads of the jobs running in the cluster. Once a cluster is set up, it may run different types of jobs, each with its own time- and complexity-based SLA. Fortunately, Apache Hadoop provides a built-in scheduler that lets administrators prioritize jobs according to the defined SLAs. So, overall resources ...