Practical Predictive Analytics

by Ralph Winters
June 2017
Content level: Beginner to intermediate
576 pages
15h 22m
English
Packt Publishing

Simulation

We will build this Spark dataframe via simulation, which will take up a good portion of this chapter. I prefer this approach to importing an existing public dataset, whose makeup you cannot control. With a simulated dataset, you are free to size it however you like (subject to account restrictions).

However, you are free to import any dataset you like; the analytic concepts that follow will be the same.
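
As a rough illustration of what "controlling the makeup" of a simulated dataset can mean, here is a minimal Python sketch. It is not the simulation built in this chapter: the function name `simulate_rows`, the column names (`id`, `age`, `measurement`, `outcome`), and the distributions are all hypothetical, chosen only to show that the row count and the share of positive outcomes can be set directly.

```python
import random


def simulate_rows(n_rows, positive_rate=0.3, seed=42):
    """Generate n_rows simulated records as a list of dictionaries.

    All column names and distributions are hypothetical placeholders,
    not the variables used in the book's own simulation.
    """
    rng = random.Random(seed)  # a fixed seed makes the simulation repeatable
    rows = []
    for i in range(n_rows):
        rows.append({
            "id": i,
            "age": rng.randint(18, 90),                       # uniform integer age
            "measurement": round(rng.gauss(100.0, 15.0), 2),  # roughly normal value
            "outcome": 1 if rng.random() < positive_rate else 0,  # binary target
        })
    return rows


# Size and makeup are both under our control.
rows = simulate_rows(1000, positive_rate=0.3)
observed_rate = sum(r["outcome"] for r in rows) / len(rows)
print(len(rows), round(observed_rate, 3))

# Assumption: in a notebook with an active SparkSession named `spark`,
# the rows could then be converted to a Spark dataframe, for example:
#   from pyspark.sql import Row
#   df = spark.createDataFrame([Row(**r) for r in rows])
```

Because the generator is seeded, rerunning it yields the same data, which makes downstream results easier to compare. Changing `n_rows` or `positive_rate` changes the size or class balance without touching anything else.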

  1. First, the preliminaries: register and log on to your Databricks account.
  2. Next, create a cluster. Give it a name, such as MyCluster.
  3. To conform with the examples in this chapter, make sure you choose Spark 2.1. This is very important. Since Spark is an ...


Publisher Resources

ISBN: 9781785886188