July 2019
Beginner to intermediate
740 pages
16h 52m
English
Let's move on to the 3-cleaning_data.ipynb notebook for our discussion on data cleaning. We will begin by importing pandas and reading in the data/nyc_temperatures.csv file, which contains the maximum daily temperature (TMAX), the minimum daily temperature (TMIN), and the average daily temperature (TAVG) from the LaGuardia Airport station in New York City for October 2018:
>>> import pandas as pd
>>> df = pd.read_csv('data/nyc_temperatures.csv')
>>> df.head()
The data we retrieved from the API is in the long format; for our analysis, we want it in the wide format, but we will address that in the Pivoting DataFrames section later in this chapter:
| | attributes | datatype | date | station | value |
|---|---|---|---|---|---|
| 0 | H,,S, | TAVG | 2018-10-01T00:00:00 | … | … |
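To make the difference between the long and wide formats concrete, here is a minimal sketch. It does not use the contents of data/nyc_temperatures.csv: the station name STATION_A and all temperature values below are made up for illustration, and the attributes column is left out for brevity. In the long format, each row holds a single measurement, with the datatype column saying which one; in the wide format, each date gets one row and each measurement gets its own column:

```python
import pandas as pd

# Hypothetical long-format data shaped like the table above.
# The station ID and the temperature values are invented for illustration.
long_df = pd.DataFrame({
    'datatype': ['TAVG', 'TMAX', 'TMIN', 'TAVG', 'TMAX', 'TMIN'],
    'date': ['2018-10-01T00:00:00'] * 3 + ['2018-10-02T00:00:00'] * 3,
    'station': ['STATION_A'] * 6,
    'value': [21.0, 25.0, 18.0, 22.0, 26.0, 19.0],
})

# Long format: one row per (date, datatype) pair.
print(long_df)

# Wide format: one row per date, one column per datatype.
wide_df = long_df.pivot(index='date', columns='datatype', values='value')
print(wide_df)
```

This is only one way to reshape the data, shown here as a preview. Note that pivot() requires each (date, datatype) pair to appear at most once, which holds for this toy data.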