July 2017
Intermediate to advanced
796 pages
18h 55m
English
To extend its big data processing capabilities, Spark can be configured to run on top of existing Hadoop-based clusters. Spark's core APIs are available in Java, Scala, Python, and R. Compared to MapReduce, Spark offers a more general and powerful programming model, and it also provides several libraries, which are part of the Spark ecosystem, for general-purpose data processing and analytics, graph processing, large-scale structured SQL, and machine learning (ML).
The Spark ecosystem consists of the following components, as shown (for details, please refer to Chapter 16, Spark Tuning):