About this Book

Apache Spark is a general-purpose data processing framework, which means you can use it for many kinds of computing tasks. It also means that any book on Apache Spark needs to cover a lot of different topics. We’ve tried to describe all aspects of using Spark: from configuring runtime options and running standalone and interactive jobs, to writing batch, streaming, or machine learning applications. We’ve also tried to pick examples and example data sets that can be run on your personal computer, that are easy to understand, and that illustrate the concepts well.

We hope you’ll find this book and its examples useful for understanding how to use and run Spark, and that they will help you write production-ready Spark applications in the future.

Who should ...
