Practical Data Science Cookbook - Second Edition by Abhijit Dasgupta, Benjamin Bengfort, Sean Patrick Murphy, Tony Ojeda, Prabhanjan Tattar

How to do it...

We will dive into the analysis stage with the following steps:

  1. Let's start by checking whether average mpg shows an overall trend over time. We first group the data by year:
In [17]: grouped = vehicles.groupby("year")
  2. Next, we compute the mean of three separate columns within each group:
In [18]: averaged = grouped[['comb08', 'highway08', 'city08']].agg([np.mean])

This produces a new data frame with three columns containing the means of the comb08, highway08, and city08 variables, respectively. Note the double brackets: selecting several columns from a grouped object requires a list, and recent pandas versions reject the bare comma-separated form. Notice that we are using the mean function supplied by NumPy (np).

  3. To make life easier, we will rename the columns and then create a new column named year, which contains the data frame's ...
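
The grouping and averaging steps can be tried without the full dataset. The following is a minimal sketch, assuming a tiny stand-in for vehicles whose rows and values are invented purely for illustration. It passes the string "mean" to agg instead of np.mean, a substitution made here because newer pandas versions emit a warning for the NumPy function; the computed result is the same.

```python
import pandas as pd

# Hypothetical stand-in for the vehicles data frame; all values are invented.
vehicles = pd.DataFrame({
    "year": [1984, 1984, 1985, 1985],
    "comb08": [20, 22, 21, 25],
    "highway08": [25, 27, 26, 30],
    "city08": [17, 19, 18, 22],
})

# Group the rows by model year.
grouped = vehicles.groupby("year")

# Average the three mpg columns within each year.
# A list of column names (double brackets) is required here.
averaged = grouped[["comb08", "highway08", "city08"]].agg(["mean"])

# Passing a list to agg yields two-level column labels such as
# ("comb08", "mean"), one pair per averaged column.
print(averaged.columns.tolist())
print(averaged)
```

The result is indexed by year, and each column label is a pair of the original column name and the aggregation name. Those two-level labels are awkward to work with, which is one reason a renaming step is useful afterwards.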
