Chapter 10. From Machine Learning to Artificial Intelligence
Statistics at the Start
Machine-learning methods have changed rapidly in the past several years, but a larger trend began about a decade ago. Specifically, the field of data science emerged, and the work shifted from statisticians toward computer engineers and algorithms (see Figure 10-1).
Classical statistics was the domain of mathematics and normal distributions. Modern data science is far more flexible about methods and assumptions, as long as the result predicts outcomes reliably. The classical approach tended to offer a single way to solve a problem, whereas newer approaches vary widely and offer multiple solution paths.
To set context, let’s review a standard analytics exercise: split a dataset into two parts, one for building the model and one for testing it, with the aim of producing a model that does not overfit the data. Overfitting can occur when assumptions drawn from the build set do not apply in general.
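As a rough sketch of that build/test split, the Python below compares a model that memorizes the build set with a simple general rule. Everything in it is an assumption made for illustration and is not from the original text: the helper names `split_dataset` and `accuracy`, the record ids, and the rule that a house is repainted once eight years have passed.

```python
import random


def split_dataset(rows, test_fraction=0.3, seed=0):
    """Shuffle rows and split them into a build set and a test set."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, rows):
    """Fraction of rows whose label the predict function gets right."""
    return sum(predict(features) == label for features, label in rows) / len(rows)


# Hypothetical data: features are (record id, years since last paint job);
# the label is True when the house was repainted within 12 months.
years_since_paint = [1, 2, 3, 5, 8, 9, 10, 12, 4, 11]
rows = [((i, years), years >= 8) for i, years in enumerate(years_since_paint)]

build, test = split_dataset(rows)

# An overfit "model": it memorizes the label of every record id in the build set
# and falls back to False for any record it has never seen.
memory = {features[0]: label for features, label in build}


def memorizer(features):
    return memory.get(features[0], False)


# A general rule that ignores the record id and uses only the house's history.
def threshold_rule(features):
    return features[1] >= 8


print("memorizer  build:", accuracy(memorizer, build), "test:", accuracy(memorizer, test))
print("threshold  build:", accuracy(threshold_rule, build), "test:", accuracy(threshold_rule, test))
```

The memorizing model scores perfectly on the build set because it has seen every record there, but it has nothing to go on for the unseen records in the test set. A gap between build and test accuracy of this kind is the usual sign of overfitting.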
For example, for a paint company seeking homeowners who might be getting ready to repaint their houses, the test set may indicate the following:
Name | Painted house within 12 months |
---|---|
Sam | Yes |
Ian | No |
Understandably, you cannot generalize on this property. But you could look at income pattern, data regarding the house purchase, and recently filed renovation ...