Model building
A model is a representation of something: a rendering or description of reality. Just like a scale model of a physical building, a data science model attempts to make sense of reality; in this case, the reality is the underlying relationship between the features and the predicted variable. Models may not be 100 percent accurate, but they are still very useful for gaining insight into our business space based on the data.
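As a minimal sketch of this idea, the following plain Python snippet fits a straight line to a handful of points by ordinary least squares. The data, the variable names (`ad_spend`, `sales`), and the helper `fit_line` are hypothetical and chosen only for illustration; they do not come from the text, and no Spark API is used here.

```python
# Illustrative sketch: fit y ≈ slope * x + intercept by ordinary least squares.
# All data and names below are hypothetical, not taken from the text.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical feature (ad spend) and predicted variable (sales).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(ad_spend, sales)

# The model's view of reality versus the observed values.
predictions = [slope * x + intercept for x in ad_spend]
residuals = [y - p for y, p in zip(sales, predictions)]

print("slope:", slope, "intercept:", intercept)
print("residuals:", residuals)
```

The residuals are not all zero, so the line does not reproduce the data exactly, yet the fitted slope still summarizes how the predicted variable moves with the feature.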
Many machine learning algorithms can help us model data, and Spark provides a number of them out of the box. Which model to build, however, is still the million-dollar question. The answer depends on various factors, such as the interpretability-accuracy trade-off, how much data you have at hand, whether the variables are categorical or numerical, ...