Chapter 4

Validating Machine Learning

IN THIS CHAPTER

Explaining how correct sampling is critical in machine learning

Highlighting errors dictated by bias and variance

Proposing different approaches to validation and testing

Warning against biased samples, overfitting, underfitting, and snooping

“I’m not running around looking for love and validation …”

— SOPHIE B. HAWKINS

Having examples (in the form of data sets) and a machine learning algorithm at hand doesn’t guarantee that a learning problem can be solved or that the results will provide the solution you want. For example, if you want your computer to distinguish a photo of a dog from a photo of a cat, you can provide it with good examples of dogs and cats. You then train a dog-versus-cat classifier, using some machine learning algorithm, that outputs the probability that a given photo shows a dog or a cat. The output is only a probability, not an absolute assurance that the photo shows a dog or a cat.
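To make the idea of a probability output concrete, here is a minimal sketch in Python. It is not the chapter's own method. It assumes each photo has already been reduced to two made-up numeric features, and it uses a tiny logistic regression as a stand-in for "some machine learning algorithm." The data, the feature values, and the names `train` and `predict_proba` are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical toy data: each "photo" is reduced to two made-up numeric
# features (illustration only). Label 1 means dog, label 0 means cat.
EXAMPLES = [
    ((0.9, 0.2), 1),
    ((0.8, 0.3), 1),
    ((0.7, 0.1), 1),
    ((0.2, 0.9), 0),
    ((0.1, 0.8), 0),
    ((0.3, 0.7), 0),
]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=2000, learning_rate=0.5):
    """Fit a tiny logistic regression with per-example gradient updates."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:
            predicted = sigmoid(weights[0] * x1 + weights[1] * x2 + bias)
            error = predicted - label
            weights[0] -= learning_rate * error * x1
            weights[1] -= learning_rate * error * x2
            bias -= learning_rate * error
    return weights, bias


def predict_proba(model, features):
    """Return the estimated probability of each class for one photo."""
    weights, bias = model
    p_dog = sigmoid(weights[0] * features[0] + weights[1] * features[1] + bias)
    return {"dog": p_dog, "cat": 1.0 - p_dog}


model = train(EXAMPLES)
probs = predict_proba(model, (0.85, 0.25))  # a new, dog-like photo
print(probs)
```

The classifier never answers "dog" or "cat" outright. It returns two numbers that sum to 1, and turning them into a class label is a separate decision.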

Based on the probability that the classifier reports, you can decide the class (dog or cat) of a photo based on the ...
