Spark framework and schedulers

The following diagram captures the components of the Spark framework and the scheduling modes in which it can be deployed:

The preceding diagram shows all the basic components of the Spark ecosystem, though some of them have evolved or been deprecated over time; we will cover those changes in due course. These are the basic components of the Spark framework:

  • Spark core: As the name suggests, this is the core control unit of the Spark framework. It predominantly handles the scheduling and management of tasks, using a Spark abstraction called resilient distributed datasets (RDDs ...
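
To make the RDD idea a little more concrete, here is a toy sketch in plain Python. It is not Spark's API: the `ToyRDD` class and all of its methods are hypothetical names invented for illustration. It assumes only the well-known properties of RDDs, namely that the data is split into partitions, that transformations are recorded lazily, and that an action triggers one unit of work (a task) per partition.

```python
from typing import Any, Callable, Iterable, List, Optional


class ToyRDD:
    """Hypothetical stand-in for an RDD (not a Spark class)."""

    def __init__(
        self,
        partitions: Optional[List[List[Any]]] = None,
        parent: Optional["ToyRDD"] = None,
        fn: Optional[Callable[[List[Any]], List[Any]]] = None,
    ) -> None:
        self._partitions = partitions  # set only on a source dataset
        self._parent = parent          # lineage: the dataset this one derives from
        self._fn = fn                  # per-partition function, applied lazily

    @classmethod
    def parallelize(cls, data: Iterable[Any], num_partitions: int = 2) -> "ToyRDD":
        items = list(data)
        # Round-robin split of the input into partitions.
        parts = [items[i::num_partitions] for i in range(num_partitions)]
        return cls(partitions=parts)

    def map(self, f: Callable[[Any], Any]) -> "ToyRDD":
        # Nothing runs here; only the transformation is recorded.
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred: Callable[[Any], bool]) -> "ToyRDD":
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def num_partitions(self) -> int:
        if self._parent is None:
            return len(self._partitions)
        return self._parent.num_partitions()

    def compute_partition(self, index: int) -> List[Any]:
        # Walk the lineage back to the source, then apply each recorded step.
        if self._parent is None:
            return list(self._partitions[index])
        return self._fn(self._parent.compute_partition(index))

    def collect(self) -> List[Any]:
        # The "action": one task per partition, run sequentially in this sketch.
        out: List[Any] = []
        for i in range(self.num_partitions()):
            out.extend(self.compute_partition(i))
        return out


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=2)
even_squares = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
result = even_squares.collect()
print(result)  # [4, 16, 36]
```

Because each derived dataset keeps only a reference to its parent and a function, any partition can be recomputed from the source on demand. In this sketch the tasks run one after another in a single process; distributing them across a cluster is the part that a real scheduler handles.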
