Spark framework and schedulers

The following diagram shows the components of the Spark framework and the scheduling modes in which it can be deployed:

The preceding diagram shows all the basic components of the Spark ecosystem, though some have evolved or been deprecated over time; we will cover those changes in due course. These are the basic components of the Spark framework:

  • Spark core: As the name suggests, this is the core control unit of the Spark framework. It predominantly handles the scheduling and management of tasks. These use a Spark abstraction called resilient distributed datasets (RDDs ...
