Apache Spark

Apache Spark is a fast, general-purpose cluster computing system, initially developed at AMPLab, UC Berkeley (http://en.wikipedia.org/wiki/UC_Berkeley), as part of the Berkeley Data Analytics Stack (BDAS). It provides high-level APIs in several programming languages, including Scala, Java, and Python, that make large-scale concurrent and parallel jobs easy to write and deploy [12:11].

Note

The link to the latest information

The URLs, like any other reference to Apache Spark, may change in future versions.

The core element of Spark is a resilient distributed dataset (RDD), which is a collection ...
