29 Introduction to Machine Learning: k-Nearest Neighbours (kNN)

Peter McQuire

29.1 Introduction

Machine learning is a vast, diverse, and rapidly evolving subject. The objective of this chapter is to provide an introduction to machine learning by discussing one particular algorithm, k-nearest neighbours, or “kNN”. We will look at two examples of kNN: an artificially simple one, followed by a more realistic one.

Given the complexity of the subject, it is beyond the scope of this book to cover the important area of machine learning in further depth; the reader is encouraged to refer to the list of Recommended Reading at the end of the chapter.

kNN is a relatively simple machine learning tool, and thus an excellent one with which to start your journey. Having said that, it remains one of the most frequently used machine learning algorithms. kNN is widely applied, in fields including investments, insurance, healthcare, and image recognition, to name but a few. kNN is a supervised machine learning algorithm (as opposed to an unsupervised machine learning algorithm, such as “k-means”). By this we mean that we start with data in which each observation is assigned to a known class, and develop an algorithm (which we “supervise”) that can be used to allocate new data to the appropriate class.
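
As a rough illustration of how labelled data can be used to allocate a new observation to a class, the following is a minimal Python sketch and not a definitive implementation. The function name `knn_predict`, the toy data, the class labels "A" and "B", and the choice of Euclidean distance with a simple majority vote are all assumptions made for illustration.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training points."""
    # Euclidean distance from the new point to every labelled training point
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote among the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per observation, each with a known class
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
                (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2.0, 2.0), k=3))  # prints A
print(knn_predict(train_points, train_labels, (6.0, 7.0), k=3))  # prints B
```

An odd value of k is used here so that, with two classes, the vote cannot be tied. In practice the features would usually be put on a comparable scale before distances are calculated, since a feature measured in large units would otherwise dominate the distance.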

For example, we may be interested in predicting how an individual will vote in a political election, given some information about that person, e.g. age and salary. ...
