July 2017
Intermediate to advanced
360 pages
8h 26m
English
A complementary requirement is that all samples belonging to a class are assigned to the same cluster. This measure can be determined using the conditional entropy H(K|C), which is the uncertainty in determining the right cluster given knowledge of the class. As with the homogeneity score, we need to normalize this using the entropy H(K):

c = 1 - H(K|C) / H(K)
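To make the definition concrete, the following minimal sketch computes H(K|C) and H(K) from a contingency table, forms 1 - H(K|C) / H(K), and compares the result with scikit-learn's completeness_score(). The helper names (entropy, manual_completeness) and the toy label vectors are illustrative assumptions, not part of the original text; returning 1.0 when H(K) = 0 is assumed here to mirror scikit-learn's convention.

```python
import numpy as np
from sklearn.metrics import completeness_score


def entropy(counts):
    """Shannon entropy (natural log) of a vector of non-negative counts."""
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))


def manual_completeness(classes, clusters):
    """Hypothetical helper: completeness as 1 - H(K|C) / H(K)."""
    _, c_idx = np.unique(np.asarray(classes), return_inverse=True)
    _, k_idx = np.unique(np.asarray(clusters), return_inverse=True)

    # table[c, k] = number of samples of class c assigned to cluster k
    table = np.zeros((c_idx.max() + 1, k_idx.max() + 1))
    np.add.at(table, (c_idx, k_idx), 1)

    h_k = entropy(table.sum(axis=0))
    if h_k == 0.0:
        # A single cluster cannot split any class (assumed convention)
        return 1.0

    # H(K|C) = sum over classes of p(c) * H(K | C = c)
    n = table.sum()
    h_k_given_c = sum(row.sum() / n * entropy(row) for row in table)
    return 1.0 - h_k_given_c / h_k


# Toy labels (illustrative only): three classes, two samples each
y_true = [0, 0, 1, 1, 2, 2]
y_merged = [0, 0, 1, 1, 1, 1]   # every class stays inside one cluster
y_split = [0, 1, 1, 1, 2, 2]    # class 0 is split across two clusters

print(manual_completeness(y_true, y_merged))   # complete assignment: 1.0
print(manual_completeness(y_true, y_split))    # below 1.0, class 0 is split
print(completeness_score(y_true, y_split))     # should agree with the line above
```

Note that merging two classes into one cluster does not lower completeness; only splitting a class across clusters does, which is why this score is read together with homogeneity.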
We can compute this score (on the same dataset) using the function completeness_score():
from sklearn.cluster import KMeans
from sklearn.metrics import completeness_score

>>> km = KMeans(n_clusters=4)
>>> Yp = km.fit_predict(X)
>>> print(completeness_score(Y, Yp))
0.807166746307
Also, in this case, the value is rather high, ...