Chapter 1. Architecture and Installation

This chapter presents the big picture of Apache Spark, including its architecture. It moves from a high-level view of the framework to installing Spark and writing your first Spark program.

If you are already familiar with these topics, feel free to skip ahead to the next chapter, on Spark's Resilient Distributed Datasets (RDDs). Otherwise, this chapter covers the following core topics:

  • Apache Spark architecture overview
  • Apache Spark deployment
  • Installing Apache Spark
  • Writing your first Spark program
  • Submitting applications

Apache Spark architecture overview

Apache Spark is an open source distributed data processing engine for clusters, which provides a unified ...
