YARN cluster mode
In YARN cluster mode, the Driver runs on a node inside the cluster, on the node that hosts the application master. The client first contacts the resource manager to request resources for the Spark job. The resource manager allocates a container (container zero) and responds to the client. The client then submits the code to the cluster, and the Driver and the Spark application master are launched in container zero. The Driver runs alongside the Spark application master, which then creates the executors in the containers allocated by the resource manager. These YARN containers can be on any node in the cluster, and each is controlled by the node manager on its node. All allocations are therefore managed by the resource manager. ...
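The allocation sequence described above can be sketched as a toy model. This is only an illustration and not Spark or YARN code: the names `Container`, `ToyResourceManager`, and `submit_cluster_mode`, the node names, and the round-robin placement are all assumptions made for the example. It shows the one point that matters here, namely that every container, including container zero, comes from the resource manager.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Container:
    """Hypothetical record of one YARN container and what runs inside it."""
    container_id: int
    node: str
    roles: list = field(default_factory=list)


class ToyResourceManager:
    """Hypothetical stand-in for the resource manager: it owns every allocation."""

    def __init__(self, nodes):
        # Round-robin placement is an assumption; a real scheduler is far richer.
        self._nodes = cycle(nodes)
        self._next_id = 0
        self.containers = []

    def allocate(self):
        container = Container(self._next_id, next(self._nodes))
        self._next_id += 1
        self.containers.append(container)
        return container


def submit_cluster_mode(rm, num_executors):
    """Walk through the cluster-mode steps against the toy resource manager."""
    # The client requests resources and the resource manager returns container zero.
    container_zero = rm.allocate()
    # The application master and the Driver are launched together in container zero.
    container_zero.roles += ["application master", "driver"]
    # Executors are created in further containers allocated by the resource manager.
    for _ in range(num_executors):
        rm.allocate().roles.append("executor")
    return rm.containers


if __name__ == "__main__":
    rm = ToyResourceManager(nodes=["node-a", "node-b", "node-c"])
    for c in submit_cluster_mode(rm, num_executors=3):
        print(c.container_id, c.node, c.roles)
```

In this sketch the Driver never runs on the client machine: it only appears in the roles of container zero, which is what distinguishes cluster mode in the description above.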