- Now that we know how to select plotting elements and change their attributes, let's create an actual data visualization. We'll read in the movie dataset, calculate the median budget for each year (in millions), and then take a five-year rolling average to smooth the data:
>>> movie = pd.read_csv('data/movie.csv')
>>> med_budget = movie.groupby('title_year')['budget'].median() / 1e6
>>> med_budget_roll = med_budget.rolling(5, min_periods=1).mean()
>>> med_budget_roll.tail()
title_year
2012.0    20.893
2013.0    19.893
2014.0    19.100
2015.0    17.980
2016.0    17.780
Name: budget, dtype: float64
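To see what `min_periods=1` contributes to the rolling average, here is a minimal sketch on made-up data. The `toy` Series and its values are hypothetical stand-ins for the per-year medians, not figures from the movie dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the per-year median budgets
toy = pd.Series(
    [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    index=[2011.0, 2012.0, 2013.0, 2014.0, 2015.0, 2016.0],
    name='budget',
)

# Default behavior: a window of 5 needs 5 observations,
# so the first four results are NaN
strict = toy.rolling(5).mean()

# With min_periods=1, a result is produced as soon as one
# observation is available, so early years average fewer values
lenient = toy.rolling(5, min_periods=1).mean()

print(pd.DataFrame({'strict': strict, 'lenient': lenient}))
```

Without `min_periods=1`, the earliest years would be missing from the smoothed series; with it, they are kept but are averaged over fewer than five years.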
- Let's get our data into NumPy arrays:
>>> years = med_budget_roll.index.values
>>> years[-5:]
array([ 2012., 2013., 2014., 2015., 2016.])
>>> budget = med_budget_roll.values
...
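As a side note, newer versions of pandas (0.24 and later) also provide `to_numpy()`, which returns the same kind of array as the `.values` attribute used here. The sketch below uses a small made-up Series, `toy_roll`, as a hypothetical stand-in for `med_budget_roll`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for med_budget_roll (values are made up)
toy_roll = pd.Series(
    [21.5, 20.0, 18.5],
    index=pd.Index([2014.0, 2015.0, 2016.0], name='title_year'),
    name='budget',
)

# Pull the index labels and the data out as plain NumPy arrays
years = toy_roll.index.to_numpy()
budget = toy_roll.to_numpy()

# Both approaches yield equivalent ndarrays for this float data
same_years = np.array_equal(years, toy_roll.index.values)
same_budget = np.array_equal(budget, toy_roll.values)
print(type(years).__name__, years.dtype, same_years, same_budget)
```

Either form works for passing the data to matplotlib; the recipe's `.values` is kept as written.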