April 2018
Beginner to intermediate
282 pages
6h 52m
English
When your dataset doesn't have a target variable, you can use clustering algorithms to explore it based on the characteristics of its examples. These algorithms group examples so that the members of each group are as similar as possible to one another and as dissimilar as possible to the examples in other groups.
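To make the grouping idea concrete, here is a minimal sketch of one common clustering algorithm, k-means. The text does not say which algorithm it has in mind, so treat this as an illustration only. The function name `kmeans`, the toy `points`, and the choice of starting centroids are assumptions made for this example.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Illustrative k-means: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its current members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Toy data (assumed for illustration): two visibly separate groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Starting centroids are picked by hand here so the run is deterministic.
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

On data this cleanly separated, the first three points should end up in one group and the last three in the other. Real datasets are rarely this tidy, which is why a quality metric is useful.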
Because you usually don't have labels when performing this kind of analysis, you cannot score the result against a known answer. There is, however, a performance metric that you can use to examine the quality of the separation found by the algorithm.
It is called the Silhouette Coefficient, and it helps you understand two things:
It will give you a value between 1 and -1, with ...
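As a rough sketch of how such a score can be computed, the pure-Python example below follows the standard definition of the Silhouette Coefficient. For each example, `a` is the mean distance to the other members of its own cluster, `b` is the mean distance to the members of the nearest other cluster, and the score is `(b - a) / max(a, b)`. The function names `silhouette_values` and `mean_silhouette` and the toy data are assumptions made for illustration. In practice you would more likely call scikit-learn's `sklearn.metrics.silhouette_score(X, labels)`.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point (illustrative, Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # usual convention for a single-member cluster
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Average the per-point values into a single score for the clustering."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)


# Toy data (assumed for illustration): two compact, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

good = mean_silhouette(points, [0, 0, 0, 1, 1, 1])  # matches the visible groups
bad = mean_silhouette(points, [0, 1, 0, 1, 0, 1])   # mixes the two groups
print(good, bad)
```

The labeling that matches the visible groups should score close to 1, while the mixed labeling should score much lower. Comparing scores this way is one option for choosing between candidate clusterings when no labels are available.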