Pro Spark Streaming: The Zen of Real-Time Analytics Using Apache Spark by Zubair Nabi

© Zubair Nabi 2016

Zubair Nabi, Pro Spark Streaming, 10.1007/978-1-4842-1479-4_2

2. Introduction to Spark

Zubair Nabi

(1) Lahore, Pakistan

There are two major products that came out of Berkeley: LSD and UNIX. We don’t believe this to be a coincidence.

—Jeremy S. Anderson

Like LSD and Unix, Spark was originally conceived in 2009[1] at Berkeley,[2] in the same Algorithms, Machines, and People (AMP) Lab that gave the world RAID, Mesos, RISC, and several Hadoop enhancements. It was initially pitched to the academic community as a distributed framework built from the ground up atop the Mesos cross-platform scheduler (then called Nexus). Spark can be thought of as an in-memory variant of Hadoop, with the following key differences:

  • Directed acyclic graph
