Programming with SparkR

So far, we have covered the runtime model of SparkR and the basic data abstractions that provide fault tolerance and scalability. We have also seen how to access the Spark API from the R shell or RStudio. It's time to try out some basic and familiar operations:

> # Open the shell
> # Try help(package=SparkR) if you want more information
> df <- createDataFrame(iris)  # Create a Spark DataFrame
> df  # Check the type. Notice the column renaming using underscore
SparkDataFrame[Sepal_Length:double, Sepal_Width:double, Petal_Length:double, Petal_Width:double, Species:string]
> showDF(df, 4)  # Print the contents of the Spark DataFrame
+------------+-----------+------------+-----------+-------+
|Sepal_Length|Sepal_Width|Petal_Length|Petal_Width|Species|
...
