Chapter 7

Modeling with Instances

IN THIS CHAPTER

Overviewing instance-based learning

Distinguishing classification from cluster methods

Working through average nearest neighbor algorithms

Mastering the core features and characteristics of k-nearest neighbor algorithms

Seeing how nearest neighbor algorithms can be used in the retail sector

Data scientists use classification methods to build predictive models that forecast the class of future observations. Classification is a form of supervised machine learning: The classification algorithm learns from labeled data. Data labels make it easier for your models to make decisions based on the logic rules you’ve defined. A plain-vanilla clustering algorithm, like the k-means method, works differently: It can help you identify subgroups within unlabeled datasets. But there’s more to life than plain vanilla. I think it’s about time to take things one step further, by exploring the instance-based family of machine ...
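
To make the idea of learning from labeled data concrete, here is a minimal sketch of a nearest neighbor classifier in plain Python. It is an illustration under stated assumptions, not the book's own code: the function name knn_predict, the toy points, and the "low"/"high" labels are all hypothetical, and the distance measure is assumed to be ordinary Euclidean distance.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k closest labeled points."""
    # Pair each labeled training point with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep only the labels of the k nearest points and take the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy dataset: two well-separated groups, each with a known label.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

# Classify a new, unlabeled observation based on the labeled instances.
prediction = knn_predict(points, labels, (1.1, 0.9), k=3)
print(prediction)
```

Notice that the labels drive the prediction: Without the labels list, the same points could only be grouped (as a clustering method like k-means would do), not assigned to a named class.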
