Breaking down Spark

We start with the main components of Spark, which are depicted in the following diagram:

Now, let's explore all the main components of Spark:

  • Spark Core: This is the foundation and the execution engine of the overall platform. It provides task distribution, scheduling, and in-memory computing. As its name implies, Spark Core is what all the other functionalities are built on top of. It is exposed through APIs in multiple languages, including Python, Java, Scala, and R.
  • Spark SQL: This is a component built on top of Spark Core that introduces a high-level data abstraction called DataFrames. We will talk about data ...
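
To make "task distribution" a little more concrete, here is a minimal sketch in plain Python. It is only an analogy and does not use Spark or its API: the helper names (partition, sum_of_squares, distributed_sum_of_squares) are hypothetical, and a local thread pool stands in for a cluster of workers. The data is split into partitions, one task runs on each partition, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, num_partitions):
    """Split data into num_partitions chunks, one chunk per task."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def sum_of_squares(chunk):
    """The task that runs independently on a single partition."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(data, num_partitions=4):
    """Illustrative only: a thread pool plays the role of the workers."""
    chunks = partition(data, num_partitions)
    # Schedule one task per partition on the pool of workers.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_results = list(pool.map(sum_of_squares, chunks))
    # Combine the partial results in the calling ("driver") process.
    return sum(partial_results)


total = distributed_sum_of_squares(list(range(1, 11)))
print(total)  # 385
```

In a real Spark program, the engine handles the partitioning, scheduling, and combining steps itself across the machines in a cluster; this sketch only mirrors the overall shape of that workflow on a single machine.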
