Distance metrics
So, before we get into the algorithms, I want to address some mathematical esotericism. When you have two points, or any number of points, in a 2D space, it's fairly easy to conceptualize. It's basically calculating the hypotenuse along some right triangle, in terms of measuring the distance. However, what happens when you have a really high dimensional space? That's what we're going to get into, and we have a lot of clustering problems.
So, the most common distance metric is the Euclidean distance. This is essentially a generalization of the 2D approach. It's the square root of the sum of squared differences between two vectors, and it can be used in any dimensional space. There's a lot of others that we're not going to ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access