January 2020
Beginner to intermediate
372 pages
10h
English
To perform k-means discretization, we used KBinsDiscretizer() from scikit-learn, setting strategy to kmeans and passing 10 to the n_bins argument as the number of clusters. With the fit() method, the transformer learned the cluster boundaries for each variable using the k-means algorithm. With the transform() method, the discretizer assigned each value of the variable to its corresponding cluster, returning a NumPy array with the discretized variables, which we converted into a dataframe.
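The steps described above can be sketched roughly as follows. This is a minimal illustration rather than the recipe's exact code: the synthetic data and the column names var_a and var_b are hypothetical, and encode="ordinal" is an assumption made here so that transform() returns a dense NumPy array of cluster indices (the scikit-learn default, "onehot", returns a sparse matrix instead).

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical synthetic data standing in for the recipe's dataset
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "var_a": rng.normal(loc=50, scale=10, size=500),
    "var_b": rng.exponential(scale=2.0, size=500),
})

# strategy="kmeans" derives the bin edges from a 1-D k-means fit per variable;
# encode="ordinal" is an assumption so the output is one integer code per value
disc = KBinsDiscretizer(n_bins=10, encode="ordinal", strategy="kmeans")

# fit() learns the cluster boundaries for each variable
disc.fit(X)

# transform() assigns each value to its cluster and returns a NumPy array
X_binned = disc.transform(X)

# Convert the array back into a dataframe, keeping the original labels
X_binned_df = pd.DataFrame(X_binned, columns=X.columns, index=X.index)

# The learned boundaries are stored per variable in bin_edges_
print(disc.bin_edges_[0])
print(X_binned_df.head())
```

After fitting, the bin_edges_ attribute holds one array of boundaries per variable, which is a convenient way to check where the cluster limits ended up.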