How it works...

In this recipe, we used pandas hist() to plot the distribution of all the numerical variables in the Boston House Prices dataset from scikit-learn. To load the data, we imported the datasets module from scikit-learn and called load_boston(). Next, we captured the data in a dataframe using pandas DataFrame(), passing the values stored in the data attribute and using the variable names stored in the feature_names attribute as the column names.
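The dataframe construction described above can be sketched as follows. Note that load_boston() was removed in scikit-learn 1.2, so this minimal sketch does not call it; instead it assumes a small stand-in object (the name dataset and its random values are hypothetical) that exposes the same two attributes, data and feature_names. The column names RM, AGE, and TAX are used only for illustration.

```python
from types import SimpleNamespace

import numpy as np
import pandas as pd

# Hypothetical stand-in for the object returned by a scikit-learn loader:
# it only mimics the two attributes the recipe relies on.
rng = np.random.default_rng(0)
dataset = SimpleNamespace(
    data=rng.normal(size=(100, 3)),      # 2-D array of feature values
    feature_names=["RM", "AGE", "TAX"],  # one name per column of `data`
)

# Values come from the data attribute, column names from feature_names.
data = pd.DataFrame(dataset.data, columns=dataset.feature_names)

print(data.head())
```

With a real scikit-learn loader, the same DataFrame() call should apply unchanged, provided the returned object has data and feature_names attributes.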

To display the histograms of all the numerical variables, we used pandas hist(), which calls matplotlib.pyplot.hist() on each variable in the dataframe, resulting in one histogram per variable. We indicated the number of intervals for the histograms using the bins argument, adjusted the ...
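As a rough illustration of the call described above, the sketch below plots one histogram per numerical column with pandas hist(). The dataframe is synthetic and its column names are hypothetical; the bins and figsize values are arbitrary choices for this example, not the settings used in the recipe.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data with three differently shaped distributions (illustrative only).
rng = np.random.default_rng(42)
data = pd.DataFrame(
    {
        "normal": rng.normal(size=500),
        "skewed": rng.exponential(size=500),
        "uniform": rng.uniform(size=500),
    }
)

# One histogram per numerical column; bins sets the number of intervals.
axes = data.hist(bins=30, figsize=(10, 6))
plt.tight_layout()

# In an interactive session, switch to an interactive backend and call
# plt.show(), or write the figure to disk with plt.savefig(...).
```

hist() returns an array of matplotlib Axes, one per subplot in the grid, so each histogram can be customized further after the call, for example by setting axis labels.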
