Practical Predictive Analytics by Ralph Winters

A note on parallel processing

Spark will take advantage of parallel processing as much as possible. Extracting training and test data samples is a prime example of this, since the training data set can be extracted independently of the test data set.

In Databricks, you can watch such independent blocks run at the same time via their progress bars. When a block of code depends on another block, you will see its execution begin, but it will remain in a wait state until the blocks it depends on have completed.

The takeaway is that you should try to structure your analytics code so that steps which do not need each other's results are kept independent of one another, which allows them to be run in parallel.
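The difference between independent and dependent steps can be sketched outside Spark. The following is a minimal illustration in plain Python, not Spark or Databricks code: a thread pool stands in for the cluster, and the names `extract_sample`, `is_train`, `is_test`, `summarize`, and `run_pipeline`, as well as the 80/20 split and the seed, are all hypothetical choices made for this example. The two extractions share a seed so that each can decide row membership on its own, without seeing the other's result.

```python
import random
from concurrent.futures import ThreadPoolExecutor

TRAIN_FRACTION = 0.8  # assumed split ratio, chosen only for illustration


def is_train(draw):
    """A row goes to the training sample when its random draw is below the cutoff."""
    return draw < TRAIN_FRACTION


def is_test(draw):
    """A row goes to the test sample otherwise."""
    return draw >= TRAIN_FRACTION


def extract_sample(rows, seed, keep):
    """Extract one sample. The same seed yields the same draw for each row,
    so the train and test extractions agree without communicating."""
    rng = random.Random(seed)
    return [row for row in rows if keep(rng.random())]


def summarize(train, test):
    """Dependent step: it needs both samples before it can run."""
    return {"train_rows": len(train), "test_rows": len(test)}


def run_pipeline(rows, seed=42):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Independent steps: neither extraction uses the other's output,
        # so both can be submitted at once.
        train_future = pool.submit(extract_sample, rows, seed, is_train)
        test_future = pool.submit(extract_sample, rows, seed, is_test)
        # Dependent step: result() waits until each sample is ready.
        return summarize(train_future.result(), test_future.result())


if __name__ == "__main__":
    print(run_pipeline(list(range(1000))))
```

Here the two `submit` calls correspond to blocks that can run at the same time, while `summarize` plays the role of a block that sits in a wait state until its inputs are complete. Python threads only interleave CPU-bound work rather than running it truly in parallel, so this shows the dependency structure, not the speedup a Spark cluster would provide.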
