Programming with SparkR

So far, we have covered the runtime model of SparkR and the basic data abstractions that provide fault tolerance and scalability. We have also seen how to access the Spark API from the R shell or RStudio. It's time to try out some basic and familiar operations:

    > # Open the shell
    > # Try help(package=SparkR) if you want more information
    > df <- createDataFrame(iris)  # Create a Spark DataFrame
    > df  # Check the type. Notice the column renaming using underscores
    SparkDataFrame[Sepal_Length:double, Sepal_Width:double, Petal_Length:double, Petal_Width:double, Species:string]
    > showDF(df, 4)  # Print the contents of the Spark DataFrame
    +------------+-----------+------------+-----------+-------+
    |Sepal_Length|Sepal_Width|Petal_Length|Petal_Width|Species|
    ...


Spark for Data Science by Srinivas Duvvuri, Bikramaditya Singhal
