R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

Working with in-memory data processing with Apache Spark

Apache Spark is a fast, general-purpose, fault-tolerant framework for interactive and iterative computations on large, distributed datasets. It supports a wide variety of data sources as well as storage layers.

It provides unified data access for combining different data formats and streaming data, and for defining complex operations using high-level, composable operators. You can develop your applications interactively using the Scala, Python, or R shell. In this recipe, you will learn various ways of interacting with Apache Spark through R for data handling, along with predictive analysis on datasets.
