June 2016
Beginner to intermediate
304 pages
6h 24m
English
So far, we have built several clustering algorithms but haven't measured their performance. In supervised learning, we simply compare the predicted values with the original labels to compute accuracy. In unsupervised learning, we don't have any labels, so we need a different way to measure the performance of our algorithms.
A good way to evaluate a clustering algorithm is to check how well the clusters are separated and how tightly the datapoints within each cluster are grouped. We need a metric that quantifies this behavior. We will use a metric called the Silhouette Coefficient score. This score is defined for each datapoint as follows:
score = (x – ...
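As a rough illustration of the idea, the following sketch computes a per-datapoint silhouette value using the standard textbook definition, (b - a) / max(a, b), where a is the mean distance from a datapoint to the other datapoints in its own cluster and b is the mean distance to the datapoints of the nearest other cluster. The names `a`, `b`, `silhouette_samples`, `silhouette_score`, and the toy data are assumptions made for this example and may not match the notation used in this chapter.

```python
import math


def silhouette_samples(points, labels):
    """Return one silhouette value per point (standard definition, Euclidean distance)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Singleton cluster: the value is conventionally set to 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the points of the nearest other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def silhouette_score(points, labels):
    """Mean silhouette value over all points."""
    scores = silhouette_samples(points, labels)
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight, well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good_labels = [0, 0, 0, 1, 1, 1]   # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]    # mixes the two groups

good = silhouette_score(points, good_labels)
bad = silhouette_score(points, bad_labels)
print("good labeling:", round(good, 3))
print("bad labeling: ", round(bad, 3))
```

Values lie between -1 and 1. A value close to 1 suggests a datapoint sits well inside its own cluster and far from the others, while a negative value suggests it may be closer to another cluster. In practice, scikit-learn's `sklearn.metrics.silhouette_score` returns the mean of these per-datapoint values, so a hand-written version like this one is mainly useful for understanding what the metric measures.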