Introduction to datasets

We will be using a real dataset from zillow.com, an online real estate marketplace that releases house price datasets as part of its research efforts. These datasets are available in the public domain and are free to use with proper attribution to zillow.com. We will be using the latest data on mean house prices of US regions, available at https://www.zillow.com/research/data/. It is a CSV dataset, that is, a plain-text file of comma-separated values. Let's start by importing the pandas module into our Jupyter Notebook, as follows:

import pandas as pd

We will then read in our dataset. Since it's a CSV file, we use pandas' read_csv function for this. We pass the file name, with a comma as the separator, to read_csv, and we create ...
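As a minimal sketch of the read_csv call described above, the following example reads comma-separated text into a DataFrame. The in-memory sample, its column names (RegionName, State, Price), and its values are hypothetical placeholders rather than real Zillow data. They stand in for the downloaded file so that the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV file. The column names and
# values below are placeholders, not real Zillow data.
sample_csv = io.StringIO(
    "RegionName,State,Price\n"
    "Region A,NY,100000\n"
    "Region B,CA,200000\n"
)

# sep="," is already the default for read_csv; it is spelled out here only
# to mirror the text. A file path string can be passed in place of the buffer.
df = pd.read_csv(sample_csv, sep=",")

# Inspect the first rows and the (rows, columns) shape of the result.
print(df.head())
print(df.shape)
```

With a real download, the StringIO buffer would be replaced by the path to the CSV file saved from the Zillow research page.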
