1. So, what is Spark, anyway?

This chapter covers

  • What Apache Spark is and its use cases
  • Basics of distributed technology
  • The four pillars of Spark
  • Storage and APIs: love the dataframe

When I was a kid in the 1980s, discovering programming through BASIC and my Atari, I could not understand why we could not automate basic law enforcement activities such as speed control, traffic-light violations, and parking meters. Everything seemed pretty easy: the book I had said that to be a good programmer, you should avoid GOTO statements. And that’s what I did, trying to structure my code from the age of 12. However, there was no way I could imagine the volume of data (and the booming Internet of Things, or IoT) while I was developing my Monopoly-like ...

Get Spark in Action, Second Edition now with the O’Reilly learning platform.