Hadoop 2.x Administration Cookbook by Gurmukh Singh

YARN label-based scheduling

In this recipe, we will configure YARN label-based scheduling. A cluster can contain a mixture of nodes with different hardware configurations, with some nodes having more memory and CPU than others.

To control which set of nodes a job executes on, we need to assign labels to the nodes. A typical case is a Spark Streaming job that should run only on nodes with high memory. For such a situation, we configure a queue and assign a set of labeled nodes to it, so that any job submitted to that queue executes on the nodes with more memory and cores.
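
The relationship between labels, nodes, and queues described above can be sketched as a toy model. This is not YARN code or a YARN API: the `Node` and `Queue` classes, the `eligible_nodes` function, the host names, and the `highmem` label are all hypothetical, chosen only to illustrate the idea that a queue tied to a label places work on the nodes carrying that label, while a queue with no label uses the unlabeled nodes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    """A worker node; an empty label means the default (unlabeled) partition."""
    name: str
    memory_gb: int
    label: str = ""


@dataclass(frozen=True)
class Queue:
    """A scheduler queue; jobs in it target nodes carrying the queue's label."""
    name: str
    label: str = ""


def eligible_nodes(queue: Queue, nodes: List[Node]) -> List[str]:
    """Return the names of the nodes whose label matches the queue's label."""
    return [n.name for n in nodes if n.label == queue.label]


# Hypothetical cluster: two small nodes and two high-memory nodes.
nodes = [
    Node("dn1", 16),
    Node("dn2", 16),
    Node("dn3", 128, "highmem"),
    Node("dn4", 128, "highmem"),
]

streaming = Queue("streaming", "highmem")  # queue mapped to the labeled nodes
default = Queue("default")                 # queue with no label

print(eligible_nodes(streaming, nodes))
print(eligible_nodes(default, nodes))
```

In this sketch a job submitted to the `streaming` queue can only land on the nodes labeled `highmem`, which mirrors the intent of the recipe; the real configuration is done through YARN and scheduler settings rather than application code.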

Getting ready

Make sure that you have a running cluster with at least two DataNodes ...
