June 2018
In this section, you will explore the dataset and perform quality checks on it. You will check the shape of the data, its data types, whether it contains any missing (NaN) values, how many feature columns it has, and what each column represents. Let's start by loading the data and exploring it:
In [30]: from sklearn.datasets import load_boston
         dataset = load_boston()
         samples, label, feature_names = dataset.data, dataset.target, dataset.feature_names

In [31]: samples.shape
Out[31]: (506, 13)

In [32]: label.shape
Out[32]: (506,)

In [33]: feature_names
Out[33]: array(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT'], dtype='<U7')
In the preceding code, you load the dataset and parse the ...
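The session above covers the shape and the feature names; the data type and missing-value checks mentioned at the start of this section can be sketched in the same spirit. The following is a minimal illustration, not the book's own code: the `quality_report` helper is a hypothetical name, and the small `samples` array is synthetic placeholder data (with one NaN injected on purpose) standing in for the real feature matrix. A stand-in is used so that the example runs with NumPy alone, since `load_boston` was removed in scikit-learn 1.2.

```python
import numpy as np

# Synthetic stand-in for the `samples` array (an assumption, not real data):
# 4 rows, 3 feature columns, with one missing value injected for illustration.
samples = np.array([
    [0.1, 18.0, 2.3],
    [0.2, 0.0, 7.1],
    [0.3, np.nan, 7.1],
    [0.4, 0.0, 2.2],
])
feature_names = np.array(['CRIM', 'ZN', 'INDUS'])


def quality_report(data, names):
    """Summarize shape, dtype, and per-column NaN counts of a 2-D float array.

    Hypothetical helper written for this example only.
    """
    # np.isnan gives a boolean mask; summing over axis 0 counts NaNs per column.
    nan_counts = np.isnan(data).sum(axis=0)
    return {
        'shape': data.shape,
        'dtype': str(data.dtype),
        'nan_per_column': {str(n): int(c) for n, c in zip(names, nan_counts)},
        'total_nan': int(nan_counts.sum()),
    }


report = quality_report(samples, feature_names)
print(report)
```

Applied to the real `samples` and `feature_names` arrays, the same helper would report the `(506, 13)` shape seen earlier along with the NaN count for each of the 13 columns.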