April 2018
282 pages
We need to import the KMeans class from scikit-learn's cluster module; the rest of the code stays similar to the hierarchical clustering code:
import pandas as pd
import numpy as np
from sklearn import preprocessing
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score

hr_data = pd.read_csv('data/hr.csv', header=0)
hr_data.head()
hr_data = hr_data.dropna()
print(hr_data.shape)
print(list(hr_data.columns))
data_trnsf = pd.get_dummies(hr_data, columns=['salary', 'sales'])
data_trnsf.columns
We need to specify the number of clusters (n_clusters) when constructing the KMeans model. It is an essential parameter for creating k-means clusters. ...
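As a minimal sketch of how n_clusters is passed when building a model, the snippet below fits KMeans on a small synthetic dataset rather than the HR data. The three blob centers, the choice of n_clusters=3, the n_init and random_state values, and the variable names are all assumptions made for illustration; they are not taken from the book's example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): three well-separated 2-D blobs, 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(50, 2)) for c in centers])

# Standardize features so that no single column dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)

# n_clusters sets how many clusters k-means will look for.
# n_init and random_state are fixed here only to make the run repeatable.
model = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = model.fit_predict(X_scaled)

# The silhouette score (between -1 and 1) summarizes how well separated the clusters are.
score = silhouette_score(X_scaled, labels)
print(model.cluster_centers_.shape)
print(round(score, 3))
```

With real data, the same pattern would apply to the dummy-encoded DataFrame, and a common way to choose n_clusters is to compare silhouette scores across several candidate values.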