November 2016
Intermediate to advanced
472 pages
16h 24m
English
This chapter covers
Apache Spark is usually defined as a fast, general-purpose, distributed computing platform. Yes, it sounds a bit like marketing speak at first glance, but we could hardly come up with a more appropriate label to put on the Spark box.
Apache Spark really did bring a revolution to the big data space. Spark makes efficient use of memory and can execute equivalent jobs 10 to 100 times faster than Hadoop’s MapReduce. On top of that, Spark’s creators managed to abstract away the fact that you’re dealing with a cluster of machines, and instead ...
Read now
Unlock full access