October 2018
Intermediate to advanced
172 pages
4h 6m
English
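Scaling is the process of standardizing your data so that every feature is on a comparable scale. Standardization centers each feature at a mean of 0 and gives it a standard deviation of 1; it does not force the values into a fixed range such as -1 to +1, although most values of a roughly bell-shaped feature end up within a few units of 0. In order to scale the data, we subtract the mean of a feature from each of its values and divide the result by the standard deviation of that feature (not the variance). In order to scale the features in our fraud detection dataset, we use the following code: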
from sklearn.preprocessing import StandardScaler
#KMeans is imported here so that the snippet is self-contained
from sklearn.cluster import KMeans

#Setting up the standard scaler
scale_data = StandardScaler()

#Scaling the data
scale_data.fit(df)
df_scaled = scale_data.transform(df)

#Applying the K-Means algorithm on the scaled data
#Initializing K-means with 2 clusters
k_means = KMeans(n_clusters = 2)

#Fitting the model on the data
k_means.fit(df_scaled)

# Inertia ...
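To make the arithmetic behind standardization concrete, here is a minimal sketch that scales a single feature by hand using only the Python standard library. The standardize helper and the amounts list are hypothetical, made up for illustration, and are not part of the fraud detection dataset. The sketch uses the population standard deviation, which is the same convention StandardScaler applies.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has no spread, so map every value to 0.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

# Hypothetical transaction amounts with one large outlier.
amounts = [10.0, 20.0, 30.0, 40.0, 1000.0]
scaled = standardize(amounts)

# After scaling, the feature has mean 0 and standard deviation 1.
print(round(mean(scaled), 6), round(pstdev(scaled), 6))
# The outlier still stands out: its scaled value lies above +1.
print(max(scaled) > 1)
```

The scaled feature has a mean of 0 and a standard deviation of 1, yet the outlier remains well above +1, which is why standardization should not be read as squeezing values into a fixed interval.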