The K-Nearest Neighbors (KNN) classifier is one of the simplest yet most commonly used classifiers in supervised machine learning. KNN is often considered a lazy learner; it doesn't technically train a model to make predictions. Instead, an observation is predicted to be the class held by the largest proportion of its k nearest observations. For example, if an observation with an unknown class is surrounded by observations of class 1, then it is classified as class 1. In this chapter we will explore how to use scikit-learn to create and use a KNN classifier.
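To make the idea concrete before the recipes, here is a minimal sketch of a KNN classifier on the iris data using scikit-learn's KNeighborsClassifier, which the rest of the chapter explores in detail; the choice of five neighbors, the standardization step, and the sample observation are illustrative assumptions, not part of any particular recipe.

# Load libraries
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Standardize features so all features contribute comparably to distances
features_standardized = StandardScaler().fit_transform(features)

# Train a KNN classifier with five neighbors (an illustrative choice)
knn = KNeighborsClassifier(n_neighbors=5).fit(features_standardized, target)

# Predict the class of a new observation by majority vote of its five nearest neighbors
knn.predict([[0.75, 0.75, 0.75, 0.75]])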
You need to find an observation’s k nearest observations (neighbors).
Use scikit-learn’s NearestNeighbors:
# Load libraries
from sklearn import datasets
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Load data
iris = datasets.load_iris()
features = iris.data

# Create standardizer
standardizer = StandardScaler()

# Standardize features
features_standardized = standardizer.fit_transform(features)

# Two nearest neighbors
nearest_neighbors = NearestNeighbors(n_neighbors=2).fit(features_standardized)

# Create an observation
new_observation = [1, 1, 1, 1]

# Find distances and indices of the observation's nearest neighbors
distances, indices = nearest_neighbors.kneighbors([new_observation])

# View the nearest neighbors
features_standardized[indices]
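Note that kneighbors also returns the distances to each neighbor. As a small, hedged extension of the solution (reusing features_standardized and new_observation defined above), the sketch below views those distances and shows that a distance metric can be passed explicitly through NearestNeighbors' metric parameter; 'euclidean' here is an illustrative choice.

# View the distances to the two nearest neighbors found above
print(distances)

# Optionally specify the distance metric explicitly (illustrative choice)
nearest_neighbors_euclidean = NearestNeighbors(
    n_neighbors=2, metric='euclidean').fit(features_standardized)

# Find the two nearest neighbors under Euclidean distance
distances, indices = nearest_neighbors_euclidean.kneighbors([new_observation])

# View the nearest neighbors
features_standardized[indices]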