This first chapter of Part III also serves as a bridge between Part II, univariate analysis, and Part III, multivariate analysis. In it, we will use clustering techniques to solve the problem that ended Chapter 7: in a dataset with the values { 1, 2, 3, 50, 97, 98, 99 }, we can intuitively see that 50 is an outlier, but our outlier detection engine steadfastly refuses to believe it.
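To make the problem concrete, here is a minimal sketch of why center-and-spread tests miss the value 50. It does not reproduce the engine's actual tests; it assumes two common stand-ins, a z-score and the 1.5 x IQR fences. The function names and the gap threshold of 10 are hypothetical, chosen only for illustration. The last function hints at the clustering idea by splitting the sorted values wherever the gap between neighbors is large.

```python
import statistics

values = [1, 2, 3, 50, 97, 98, 99]


def z_score(x, data):
    """Distance of x from the mean, in population standard deviations."""
    return (x - statistics.mean(data)) / statistics.pstdev(data)


def outside_iqr_fences(x, data, k=1.5):
    """True if x falls outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return x < q1 - k * iqr or x > q3 + k * iqr


def split_on_gaps(data, max_gap):
    """Group sorted values, starting a new group when a gap exceeds max_gap.

    max_gap is an arbitrary, hand-picked threshold (an assumption of this
    sketch), not something derived from the data.
    """
    ordered = sorted(data)
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > max_gap:
            groups.append([cur])
        else:
            groups[-1].append(cur)
    return groups


# 50 is both the mean and the median, so it looks perfectly ordinary.
print(z_score(50, values))               # 0.0
print(outside_iqr_fences(50, values))    # False

# Grouping by proximity isolates 50 in a group of its own.
print(split_on_gaps(values, max_gap=10)) # [[1, 2, 3], [50], [97, 98, 99]]
```

Because 50 sits exactly at the center of the data, any test that measures distance from the center will score it as the least unusual point. Only by looking at how the values group together does it stand out.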
This chapter will begin with a description of the concept of clustering. Then, we will look ...