November 2017
Intermediate to advanced
374 pages
10h 19m
English
Import the California housing dataset and the libraries we have been using: NumPy, pandas, and Matplotlib. The dataset is medium-sized, but it is large relative to the other scikit-learn datasets:
from __future__ import division
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing

# From within an IPython notebook
%matplotlib inline

cali_housing = fetch_california_housing()

X = cali_housing.data
y = cali_housing.target
Bin the target variable so that the dataset can be split in a way that keeps the target distribution balanced across the splits:
bins = np.arange(6)
binned_y = np.digitize(y, bins)
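To see what the binning step produces, here is a minimal sketch on a handful of made-up target values. The array sample_y and the names sample_binned and counts are hypothetical and exist only for illustration; only np.arange and np.digitize come from the recipe.

```python
import numpy as np

# Hypothetical target values standing in for the housing target (assumption).
sample_y = np.array([0.15, 0.99, 1.0, 2.5, 4.2, 5.00001])

# Same bin edges as in the recipe: array([0, 1, 2, 3, 4, 5]).
bins = np.arange(6)

# np.digitize returns, for each value, the index i with bins[i-1] <= value < bins[i].
# Values at or above the last edge get the index len(bins), here 6.
sample_binned = np.digitize(sample_y, bins)
print(sample_binned)  # [1 1 2 3 5 6]

# Count how many samples fall under each bin label.
counts = np.bincount(sample_binned)
print(counts)  # [0 2 1 1 0 1 1]
```

The integer labels behave like class labels, which is what makes them usable for stratified splitting of a continuous target.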
Split the dataset X and y into three sets. X_1 and X_stack refer to the input variables of the first ...
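One way to obtain three sets is to call train_test_split twice and stratify each call on the binned target. The sketch below is an illustration under stated assumptions, not necessarily the recipe's exact code: it uses synthetic data instead of the housing data, and the names X_demo, y_demo, X_rest, X_test and the split proportions (0.2 and 0.33) are hypothetical choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for X and y (assumption): 600 samples, 8 features,
# target values spread uniformly between 0 and 5.
rng = np.random.RandomState(0)
X_demo = rng.rand(600, 8)
y_demo = rng.uniform(0.0, 5.0, size=600)

# Bin the continuous target so it can be used for stratification.
bins = np.arange(6)
binned_demo = np.digitize(y_demo, bins)

# First split: hold out a test set, stratified on the binned target.
X_rest, X_test, y_rest, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=binned_demo, random_state=0)

# Second split: divide the remainder into two sets, again stratified.
# The bins are recomputed because y_rest is a shuffled subset of y_demo.
binned_rest = np.digitize(y_rest, bins)
X_1, X_stack, y_1, y_stack = train_test_split(
    X_rest, y_rest, test_size=0.33, stratify=binned_rest, random_state=0)

print(X_1.shape, X_stack.shape, X_test.shape)
```

Recomputing the bin labels before the second split keeps the stratification labels aligned with the rows that remain after the first split.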