April 2018
In this method, a statistical test is applied to each feature individually. We retain only the best features according to the test outcome scores.
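To make the per-feature scoring concrete, here is a minimal sketch on synthetic data rather than the HR attrition dataset. The names `X_demo`, `y`, `selector`, and `X_reduced` are made up for illustration, and the Poisson-distributed features are an assumption chosen only because the chi-squared test requires non-negative values. Two columns are built to depend on the class and three are pure noise, so the scores should separate them clearly.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Synthetic, non-negative data (chi2 rejects negative feature values).
rng = np.random.default_rng(0)
n_samples = 200
y = rng.integers(0, 2, size=n_samples)  # hypothetical binary target

# Two informative count features whose mean depends on the class,
# followed by three noise features that ignore the class.
informative = rng.poisson(lam=2 + 4 * y[:, None], size=(n_samples, 2))
noise = rng.poisson(lam=3, size=(n_samples, 3))
X_demo = np.hstack([informative, noise])

# Score every feature independently against y, then keep the top 2.
selector = SelectKBest(score_func=chi2, k=2)
X_reduced = selector.fit_transform(X_demo, y)

print('Chi-squared score per feature:', selector.scores_)
print('Selected column indices:', selector.get_support(indices=True))
print('Shape before and after:', X_demo.shape, X_reduced.shape)
```

After fitting, `scores_` holds one test score per input column and `get_support(indices=True)` reports which columns were kept, which is useful for mapping the reduced matrix back to feature names.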
The following example illustrates the chi-squared statistical test to select the best features from the HR attrition dataset:
# Chi2 Selector
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

chi2_model = SelectKBest(score_func=chi2, k=4)
X_best_feat = chi2_model.fit_transform(X, Y)

# selected features
print('Number of features:', X.shape[1])
print('Reduced number of features:', X_best_feat.shape[1])
We can see from the following output that the 4 best features were selected. We can change the number of best features to be considered by ...