May 2015
Intermediate to advanced
240 pages
5h 30m
English
The following glossary terms are used in this book. The definitions provide additional explanation and pointers to more information about these concepts.
Apache Spark™
An open source, high-performance cluster computing environment that makes use of in-memory primitives. It is used for processing large volumes of data in real time. More information is available at http://en.wikipedia.org/wiki/Apache_Spark.
Apache Hadoop™
An open source computing environment that combines distributed storage with various processing engines to support the processing of large volumes of data in batches. More information is available at http://en.wikipedia.org/wiki/Apache_Hadoop.
Application
A collection of software programs that together perform a specific ...