January 2019
Intermediate to advanced
378 pages
8h 27m
English
Another way to see which features affect our model is to look at the feature importances produced by a random forest classifier. These tend to reflect the true impact of a given feature more accurately.
Let's run our data through this type of model and examine the results:
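import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Fit a random forest with 1,000 trees
clf_rf = RandomForestClassifier(n_estimators=1000)
clf_rf.fit(X_train, y_train)

# Forest-level importance per feature, plus its spread across the individual trees
f_importances = clf_rf.feature_importances_
f_names = X_train.columns
f_std = np.std([tree.feature_importances_ for tree in clf_rf.estimators_], axis=0)

# Rank (importance, name, std) tuples by importance, highest first
zz = zip(f_importances, f_names, f_std)
zzs = sorted(zz, key=lambda x: x[0], reverse=True)

# Keep the importances of the top 10 features
n_features = 10
imps = [x[0] for x in zzs[:n_features]]
...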
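Because the listing above is cut short and depends on X_train and y_train defined elsewhere, here is a minimal self-contained sketch of the same ranking idea. It makes several assumptions that are not from the original: the data is synthetic (generated with make_classification), the helper top_importances and the names X_demo, y_demo, and clf_demo are hypothetical, and the forest uses 100 trees rather than 1,000 to keep it quick to run.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def top_importances(clf, feature_names, n_features=10):
    """Return (name, importance, std) tuples, highest importance first.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    importances = clf.feature_importances_
    # Spread of each feature's importance across the individual trees
    std = np.std([tree.feature_importances_ for tree in clf.estimators_], axis=0)
    ranked = sorted(
        zip(feature_names, importances, std),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:n_features]


# Synthetic stand-in for the training data (assumption: 8 features, 3 informative)
X_arr, y_demo = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)
X_demo = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(8)])

clf_demo = RandomForestClassifier(n_estimators=100, random_state=0)
clf_demo.fit(X_demo, y_demo)

top = top_importances(clf_demo, X_demo.columns, n_features=5)
for name, importance, spread in top:
    print(f"{name}: {importance:.3f} +/- {spread:.3f}")
```

A large standard deviation relative to the importance itself suggests that the trees disagree about how much a feature matters, so its position in the ranking should be read with some caution.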