July 2017
Beginner to intermediate
715 pages
17h 3m
English
Clustering can be seen as a method for feature engineering, and the results of clustering can be added to a supervised model as a set of additional features.
The simplest way of doing this is to one-hot encode the clustering results: each observation gets a vector of length k with a 1 at the index of its assigned cluster and 0 everywhere else.
It looks very simple in code:
KMeans km = new KMeans(X, k, maxIter, runs);
int[] labels = km.getClusterLabel();

SparseDataset sparse = new SparseDataset(k);
for (int i = 0; i < labels.length; i++) {
    sparse.set(i, ...
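To make the encoding step concrete without depending on any clustering library, here is a minimal sketch in plain Java. It assumes the cluster labels are already available as an int array with values from 0 to k - 1, such as the labels array above. The class name ClusterFeatures and the methods oneHot and appendFeatures are hypothetical names introduced only for this illustration, and the sketch uses dense arrays rather than a sparse dataset to keep it short.

```java
import java.util.Arrays;

// Hypothetical helper class, introduced only for this illustration.
final class ClusterFeatures {

    /** One-hot encodes cluster labels: row i gets a 1.0 in column labels[i]. */
    static double[][] oneHot(int[] labels, int k) {
        double[][] encoded = new double[labels.length][k]; // Java zero-initializes arrays
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] < 0 || labels[i] >= k) {
                throw new IllegalArgumentException("label out of range: " + labels[i]);
            }
            encoded[i][labels[i]] = 1.0;
        }
        return encoded;
    }

    /** Appends the extra columns to the right of the original feature matrix. */
    static double[][] appendFeatures(double[][] X, double[][] extra) {
        if (X.length != extra.length) {
            throw new IllegalArgumentException("row counts differ");
        }
        double[][] result = new double[X.length][];
        for (int i = 0; i < X.length; i++) {
            // Copy the original row into a wider array, then fill in the new columns.
            double[] row = Arrays.copyOf(X[i], X[i].length + extra[i].length);
            System.arraycopy(extra[i], 0, row, X[i].length, extra[i].length);
            result[i] = row;
        }
        return result;
    }
}
```

Calling ClusterFeatures.appendFeatures(X, ClusterFeatures.oneHot(labels, k)) returns a matrix with k extra columns that can be passed to a supervised model. With a large k most of these values are zero, so a sparse representation is the more economical choice.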