- Navigate to Subtopic Data exploration in the Jupyter Notebook and run the cell containing df.describe():
This computes summary statistics for each column, including the mean, standard deviation, minimum, and maximum. The resulting table gives a high-level view of how the values in each column are distributed. Note that we have taken the transpose of the result by appending .T to the output; this swaps the rows and columns, so that each row corresponds to one column of the dataset. Going forward with the analysis, we will specify a set of columns to focus on.
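To make the effect of .T concrete, here is a minimal sketch on a small made-up DataFrame. The name toy_df and its values are assumptions for illustration only; they are not the notebook's df or its data.

```python
import pandas as pd

# Hypothetical toy data standing in for the notebook's df; the values are made up.
toy_df = pd.DataFrame({
    'RM': [6.0, 6.5, 7.0, 5.5],
    'AGE': [60.0, 70.0, 80.0, 90.0],
    'MEDV': [20.0, 22.0, 30.0, 18.0],
})

# Without .T: one row per statistic, one column per variable.
summary = toy_df.describe()

# With .T: one row per variable, one column per statistic.
summary_t = toy_df.describe().T

print(summary.shape)    # (number of statistics, number of variables)
print(summary_t.shape)  # (number of variables, number of statistics)
print(summary_t[['mean', 'std', 'min', 'max']])
```

The transposed layout tends to be easier to read when a dataset has many columns, since each variable gets its own row and the statistics run across the page.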
- Run the cell where these "focus columns" are defined:
cols = ['RM', 'AGE', 'TAX', 'LSTAT', 'MEDV']
This subset of columns can ...