Apache Spark

Apache Spark is a fast, general-purpose cluster computing system, initially developed at AMPLab, UC Berkeley (http://en.wikipedia.org/wiki/UC_Berkeley), as part of the Berkeley Data Analytics Stack (BDAS). It provides high-level APIs in several programming languages that make large, concurrent parallel jobs easy to write and deploy [12:11].

Note

The link to the latest information

The URLs, like any reference to Apache Spark, may change in future versions.

The core element of Spark is a resilient distributed dataset (RDD), which is a collection ...
