Chapter 3. K-Nearest Neighbors Classification

You probably know someone who really likes a certain brand, such as a particular technology company or clothing manufacturer. Usually you can detect this by what the person wears, talks about, and interacts with. But what are some other ways we could determine brand affinity?

For an ecommerce site, we could identify brand loyalty by looking at previous orders of similar users to see what they’ve bought. So, for instance, let’s assume that a user has a history of orders, each including two items, as shown in Figure 3-1.

Figure 3-1. User with a history of orders of multiple brands

Based on his previous orders, we can see that this user buys a lot of Milan Clothing Supplies (not a real brand, but you get the picture). Across his last five orders, he has bought five Milan Clothing Supplies shirts, so we could say he has a certain affinity toward this company. Knowing this, if we ask which brand this user is most interested in, Milan Clothing Supplies would be at the top.

This general idea is known as the K-Nearest Neighbors (KNN) classification algorithm. In our case, K equals 5, and each order represents a vote on a brand. Whichever brand receives the most votes is our classification. This chapter will introduce and define KNN classification as well as work through a code example that detects whether a face ...
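The voting step described above can be sketched in a few lines of Python. This is only a minimal illustration, not the chapter's own code: the function name `top_brand`, the order data, and the placeholder brands "Brand A", "Brand B", and "Brand C" are all hypothetical, and the sketch assumes every item in each of the K most recent orders casts one vote for its brand. It also takes the K neighbors as given, whereas a full KNN classifier would first select them with a distance measure.

```python
from collections import Counter

def top_brand(orders, k=5):
    """Return the brand with the most votes among the k most recent orders.

    orders: list of orders, most recent first; each order is a tuple of
    brand names, one per item (hypothetical data layout).
    """
    votes = Counter()
    for order in orders[:k]:
        # Assumption: each item in an order casts one vote for its brand
        votes.update(order)
    brand, _count = votes.most_common(1)[0]
    return brand

# Hypothetical history: five orders of two items each.
# Only "Milan Clothing Supplies" comes from the text; the rest are placeholders.
orders = [
    ("Milan Clothing Supplies", "Brand A"),
    ("Milan Clothing Supplies", "Brand B"),
    ("Milan Clothing Supplies", "Brand A"),
    ("Milan Clothing Supplies", "Brand C"),
    ("Milan Clothing Supplies", "Brand B"),
]

print(top_brand(orders, k=5))
```

With this made-up history, Milan Clothing Supplies collects five votes and no other brand collects more than two, so it wins the vote.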
