10. Manual Feature Engineering: Manipulating Data for Fun and Profit

In [1]:

# setup
from mlwpy import *
%matplotlib inline

iris = datasets.load_iris()
# hold out 25% of the examples as a test set
(iris_train,     iris_test,
 iris_train_tgt, iris_test_tgt) = skms.train_test_split(iris.data,
                                                        iris.target,
                                                        test_size=.25)
# remove units ' (cm)' from names
iris.feature_names = [fn[:-5] for fn in iris.feature_names]

# dataframe for convenience
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['species'] = iris.target_names[iris.target]

10.1 Feature Engineering Terminology and Motivation

We are going to turn our attention away from expanding our catalog of models and instead take a closer look at the data. Feature engineering refers to manipulation—addition, ...
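
As a minimal sketch of what adding a feature can look like, the cell below rebuilds the iris dataframe from the setup code and appends one derived column. The `sepal area` column and its definition (length times width) are assumptions chosen for illustration, not a feature taken from the text, and the cell imports `pandas` and `sklearn` directly rather than relying on `mlwpy`.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()

# strip the trailing ' (cm)' unit from each feature name, as in the setup cell
feature_names = [fn[:-5] for fn in iris.feature_names]
iris_df = pd.DataFrame(iris.data, columns=feature_names)

# hypothetical derived feature: the product of two existing columns
iris_df['sepal area'] = iris_df['sepal length'] * iris_df['sepal width']

print(iris_df[['sepal length', 'sepal width', 'sepal area']].head(3))
```

The original columns are left untouched, so a model can be trained with or without the new column to compare the two.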
