2. Architecture and flow

This chapter covers

  • Building a mental model of Spark for a typical use case
  • Understanding the associated Java code
  • Exploring the general architecture of a Spark application
  • Understanding the flow of data

In this chapter, you will build a mental model of Apache Spark. A mental model is an explanation of how something works in the real world, expressed through your own thought process and supported by diagrams. The goal of this chapter is to help you form your own ideas about the thought process I will walk you through. I will use a lot of diagrams and some code. It would be extremely pretentious to claim a single, definitive Spark mental model; this one describes a typical scenario involving loading, processing, and saving data. You will ...

