Spark for Data Science by Bikramaditya Singhal, Srinivas Duvvuri

Programming with SparkR

So far, we have covered the SparkR runtime model and the basic data abstractions that provide fault tolerance and scalability, and we have seen how to access the Spark API from the R shell or RStudio. It's time to try out some basic and familiar operations:

> # Open the shell
> # Try help(package=SparkR) if you want more information
> df <- createDataFrame(iris)  # Create a Spark DataFrame
> df  # Check the type. Notice the column renaming using underscore
SparkDataFrame[Sepal_Length:double, Sepal_Width:double, Petal_Length:double, Petal_Width:double, Species:string]
> showDF(df, 4)  # Print the contents of the Spark DataFrame
+------------+-----------+------------+-----------+-------+
|Sepal_Length|Sepal_Width|Petal_Length|Petal_Width|Species|
...
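A few more familiar operations on the same SparkDataFrame could look like the sketch below. This assumes SparkR is loaded and a Spark session is active (e.g. via sparkR.session()), and uses standard SparkR API functions (head, count, select, filter, groupBy, agg, n); it is an illustrative sketch, not output from the book:

```r
# Assumes a running SparkR session and the df created above from iris.

head(df, 3)                          # First rows, collected as a local R data.frame
count(df)                            # Number of rows in the SparkDataFrame
species <- select(df, "Species")     # Project a single column
setosa  <- filter(df, df$Species == "setosa")   # Filter rows by a condition
showDF(agg(groupBy(df, "Species"),   # Rows per species, computed on the cluster
           count = n(df$Species)))
```

Note that SparkR masks several base R functions (such as filter); operations like filter and select build a distributed query plan, and nothing is materialized locally until an action such as head, count, or showDF runs.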
