Get full access to Training Systems Using Python Statistical Modeling and 60K+ other titles, with a free 10-day trial of O'Reilly.

The silhouette method

The silhouette method requires that we first compute a silhouette score for each data point. For a single data point i, let a(i) be its average dissimilarity to the other points in its own cluster, and let b(i) be its average dissimilarity to the points in the nearest neighboring cluster, that is, the cluster other than its own for which this average is smallest. The silhouette score is b(i) minus a(i), divided by the larger of these two numbers, so it always lies between -1 and 1. This is represented using the following formula:

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$

We plot the average silhouette score for all data points in the dataset. A large silhouette score, close to 1, means that the data point is dissimilar to data points in other clusters, and the data point seems to ...
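
To make the definition concrete, here is a minimal sketch of the per-point silhouette score and its dataset average. It is an illustration only, not the book's code: it assumes Euclidean distance as the dissimilarity measure, the function name `silhouette_scores` and the toy data are hypothetical, and it follows the common convention that a point alone in its cluster scores 0.

```python
import math


def silhouette_scores(points, labels):
    """Return one silhouette score per point (Euclidean distance assumed)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette scores need at least two clusters")

    def mean_dist(i, members):
        # Average dissimilarity between point i and the given points.
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a singleton cluster gets a score of 0.
            scores.append(0.0)
            continue
        # Average dissimilarity to the other points in the same cluster.
        a = mean_dist(i, own)
        # Smallest average dissimilarity to any other cluster.
        b = min(
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]

scores = silhouette_scores(points, labels)
average_score = sum(scores) / len(scores)

# The same points with a poor assignment that mixes the two groups.
bad_scores = silhouette_scores(points, [0, 1, 0, 1])
```

With the well-separated labeling every point scores about 0.9, while the mixed labeling produces negative scores, which matches the interpretation that values near 1 indicate a point sits comfortably in its own cluster. In practice a library routine such as scikit-learn's `silhouette_score` would normally replace a hand-written loop like this one.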
