Spark architecture

Let's have a look at Apache Spark architecture, including a high level overview and a brief description of some of the key software components.

High level overview

At the high level, Apache Spark application architecture consists of the following key software components and it is important to understand each one of them to get to grips with the intricacies of the framework:

  • Driver program
  • Master node
  • Worker node
  • Executor
  • Tasks
  • SparkContext
  • SQL context
  • Spark session

Here's an overview of how some of these software components fit together within the overall architecture:

High level overview

Figure 1.15: Apache Spark application architecture - Standalone mode ...

Get Learning Apache Spark 2 now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.