Data science today is a lot like the Wild West: there’s endless opportunity and excitement, but also a lot of chaos and confusion. If you’re new to data science and applied machine learning, evaluating a machine-learning model can seem pretty overwhelming. Now you have help. With this O’Reilly report, machine-learning expert Alice Zheng takes you through the model evaluation basics.
In this overview, Zheng first introduces the machine-learning workflow, and then dives into evaluation metrics and model selection. The latter half of the report focuses on hyperparameter tuning and A/B testing, which may benefit more seasoned machine-learning practitioners.
With this report, you will:
- Learn the stages involved when developing a machine-learning model for use in a software application
- Understand the metrics used for supervised learning models, including classification, regression, and ranking
- Walk through evaluation mechanisms, such as hold-out validation, cross-validation, and bootstrapping
- Explore hyperparameter tuning in detail, and discover why it’s so difficult
- Learn the pitfalls of A/B testing, and examine a promising alternative: multi-armed bandits
- Get suggestions for further reading, as well as useful software packages
Table of contents
- 1. Orientation
- 2. Evaluation Metrics
  - Classification Metrics
  - Ranking Metrics
  - Regression Metrics
  - Caution: The Difference Between Training Metrics and Evaluation Metrics
  - Caution: Skewed Datasets—Imbalanced Classes, Outliers, and Rare Data
  - Related Reading
  - Software Packages
- 3. Offline Evaluation Mechanisms: Hold-Out Validation, Cross-Validation, and Bootstrapping
- 4. Hyperparameter Tuning
- 5. The Pitfalls of A/B Testing
  - A/B Testing: What Is It?
  - Pitfalls of A/B Testing
    - 1. Complete Separation of Experiences
    - 2. Which Metric?
    - 3. How Much Change Counts as Real Change?
    - 4. One-Sided or Two-Sided Test?
    - 5. How Many False Positives Are You Willing to Tolerate?
    - 6. How Many Observations Do You Need?
    - 7. Is the Distribution of the Metric Gaussian?
    - 8. Are the Variances Equal?
    - 9. What Does the p-Value Mean?
    - 10. Multiple Models, Multiple Hypotheses
    - 11. How Long to Run the Test?
    - 12. Catching Distribution Drift
  - Multi-Armed Bandits: An Alternative
  - Related Reading
  - That’s All, Folks!
- Title: Evaluating Machine Learning Models
- Release date: September 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491932445