Skip to Content
Mastering Spark with R
book

Mastering Spark with R

by Javier Luraschi, Kevin Kuo, Edgar Ruiz
October 2019
Beginner to intermediate
293 pages
6h 55m
English
O'Reilly Media, Inc.
Content preview from Mastering Spark with R

Chapter 1. Introduction

You know nothing, Jon Snow.

—Ygritte

With information growing at exponential rates, it’s no surprise that historians are referring to this period of history as the Information Age. The increasing speed at which data is being collected has created new opportunities and is certainly poised to create even more. This chapter presents the tools that have been used to solve large-scale data challenges. First, it introduces Apache Spark as a leading tool that is democratizing our ability to process large datasets. With this as a backdrop, we introduce the R computing language, which was specifically designed to simplify data analysis. Finally, this leads us to introduce sparklyr, a project merging R and Spark into a powerful tool that is easily accessible to all.

Chapter 2, Getting Started presents the prerequisites, tools, and steps you need to perform to get Spark and R working on your personal computer. You will learn how to install and initialize Spark, get introduced to common operations, and get your very first data processing and modeling task done. It is the goal of that chapter to help anyone grasp the concepts and tools required to start tackling large-scale data challenges which, until recently, were accessible to just a few organizations.

You then move into learning how to analyze large-scale data, followed by building models capable of predicting trends and discover information hidden in vast amounts of information. At which point, you will have ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Advanced Machine Learning with R

Advanced Machine Learning with R

Cory Lesmeister, Dr. Sunil Kumar Chinnamgari
Advanced R

Advanced R

Hadley Wickham
Regression Analysis with R

Regression Analysis with R

Giuseppe Ciaburro, Pierre Paquay, Manoj Kumar, Shaikh Salamatullah

Publisher Resources

ISBN: 9781492046363Errata Page