In this chapter, we worked step by step from raw data to a holistic view of the business: we processed a large amount of data on Spark and then built a model that gives a holistic view of the sales team's success for the IFS company.

Specifically, after setting up the Spark computing environment and loading the preprocessed data, we first selected models according to business needs. Second, we prepared and reduced features. Third, we estimated model coefficients. Fourth, we evaluated the estimated models. Fifth, we interpreted the analytical results. Finally, we deployed our estimated models.
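
The middle steps of this workflow (reducing features, estimating coefficients, and evaluating the fit) can be sketched on a tiny in-memory dataset. This is only a small-data illustration in plain Python, not the Spark code or the models used in the chapter; the toy rows, the column meanings, and the helper names `reduce_features`, `fit_simple_ols`, and `rmse` are all hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical toy data: (calls_made, region_code, sales).
# region_code is constant, so it carries no information.
rows = [
    (1.0, 7.0, 3.0),
    (2.0, 7.0, 5.0),
    (3.0, 7.0, 7.0),
    (4.0, 7.0, 9.0),
]

def reduce_features(rows, n_features, min_variance=1e-9):
    """Return indices of feature columns whose variance exceeds min_variance."""
    return [j for j in range(n_features)
            if pvariance([r[j] for r in rows]) > min_variance]

def fit_simple_ols(xs, ys):
    """Estimate intercept and slope of y = a + b*x by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def rmse(xs, ys, intercept, slope):
    """Root mean squared error of the fitted line on (xs, ys)."""
    return mean((y - (intercept + slope * x)) ** 2
                for x, y in zip(xs, ys)) ** 0.5

# Step 2: prepare and reduce features (drops the constant column).
kept = reduce_features(rows, n_features=2)

# Step 3: estimate coefficients on the single remaining feature.
xs = [r[kept[0]] for r in rows]
ys = [r[2] for r in rows]
intercept, slope = fit_simple_ols(xs, ys)

# Step 4: evaluate the estimated model.
error = rmse(xs, ys, intercept, slope)
```

On a cluster, each of these steps would be replaced by its distributed counterpart, but the order of operations stays the same.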

The preceding process is similar to the process of working with small data. However, in dealing with big data, we need parallel computing, for which ...
