The real world is messy. Recognizing this mess is what separates a sophisticated and useful analysis from a hopelessly naive one. This is especially true for highly complicated models, where it is tempting to mistake noise for signal and hence “overfit.” The ability to deal with this mess and noise is the most important skill you need to keep from embarrassing yourself as you work with, and learn from, data.

In any analysis, you have targeted unknowns and untargeted unknowns. The former are built into the model as parameters to be estimated, and you use these estimates explicitly in your decision-making processes. ...

