July 2017
Intermediate to advanced
796 pages
18h 55m
English
A convenient way to use SparkR is from RStudio. Your R program can be connected to a Spark cluster from RStudio, the R shell, Rscript, or other R IDEs.
Option 1. Set SPARK_HOME in the environment (see the Sys.getenv documentation at https://stat.ethz.ch/R-manual/R-devel/library/base/html/Sys.getenv.html), load the SparkR package, and call sparkR.session as follows. sparkR.session checks for a Spark installation and, if none is found, downloads and caches one automatically:
if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
  Sys.setenv(SPARK_HOME = "/home/spark")
}
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
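The snippet above sets SPARK_HOME and loads the package, but does not show the session call that Option 1 describes. The following is a minimal sketch of the full sequence, not the book's exact listing. It assumes a local master (`local[*]`) and a 2 GB driver memory setting, both chosen only for illustration. The helper `resolve_spark_home` is a hypothetical name introduced here, and `/home/spark` is the placeholder path from the text.

```r
# Sketch: resolve SPARK_HOME, load SparkR from it, and start a local session.
# resolve_spark_home() is a hypothetical helper, not part of SparkR.
resolve_spark_home <- function(default = "/home/spark") {
  home <- Sys.getenv("SPARK_HOME")
  if (nchar(home) < 1) {
    # Fall back to the placeholder path used in the text.
    Sys.setenv(SPARK_HOME = default)
    home <- default
  }
  home
}

spark_home <- resolve_spark_home()
spark_lib <- file.path(spark_home, "R", "lib")

# Only try to start a session if the SparkR package directory is present.
if (dir.exists(file.path(spark_lib, "SparkR"))) {
  library(SparkR, lib.loc = spark_lib)

  # Assumed settings: local master using all cores, 2 GB driver memory.
  sparkR.session(master = "local[*]",
                 sparkConfig = list(spark.driver.memory = "2g"))

  # Quick check: turn a built-in R data frame into a SparkDataFrame.
  df <- as.DataFrame(faithful)
  print(head(df))

  sparkR.session.stop()
} else {
  message("SparkR not found under ", spark_lib, "; adjust SPARK_HOME.")
}
```

Guarding on the package directory keeps the script from failing with a load error when SPARK_HOME points at the wrong location. To connect to a cluster rather than a local instance, the `master` argument would be replaced with the cluster's master URL.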
Option 2. You can also configure SparkR manually in RStudio. To do so, create an R script and execute ...