Practical Predictive Analytics by Ralph Winters

Dataframes

Spark RDDs can be difficult to work with, so recent versions of Spark provide a Dataframe abstraction built on top of RDDs. Dataframes let analysts view data in forms they are used to, such as tables and lists. They also let high-level languages such as R and Python use syntax that is familiar to their users while still running optimized code that performs on par with native Scala or Spark SQL.

Databricks is a company founded by the creators of Apache Spark. It offers a free environment for running Spark programs in the cloud using R, Java, Scala, or Python.

I will be using the Databricks environment to illustrate the code needed to run ...
