- Read in the employee dataset and create a DatetimeIndex from the HIRE_DATE column:
>>> employee = pd.read_csv('data/employee.csv', parse_dates=['JOB_DATE', 'HIRE_DATE'], index_col='HIRE_DATE')
>>> employee.head()
- Let's start with a simple grouping by gender alone, and find the average salary for each group, rounded to the nearest hundred:
>>> employee.groupby('GENDER')['BASE_SALARY'].mean().round(-2)
GENDER
Female    52200.0
Male      57400.0
Name: BASE_SALARY, dtype: float64
- Now let's find the average salary by hire date, grouping everyone into 10-year buckets:
>>> employee.resample('10AS')['BASE_SALARY'].mean().round(-2)
HIRE_DATE
1958-01-01    81200.0
1968-01-01    ...
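
The steps above can be tried without the original CSV file. The following is a minimal sketch that assumes a tiny, made-up stand-in for the employee data (five hypothetical rows; the values are illustrative and not taken from the real dataset). It uses the '10YS' offset alias, which should behave like the '10AS' alias used above; recent pandas releases deprecate 'AS' in favor of 'YS'.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for data/employee.csv.
csv_text = """HIRE_DATE,GENDER,BASE_SALARY
1958-06-01,Female,80000
1965-03-15,Male,82000
1970-01-10,Female,60000
1975-07-04,Male,64000
1981-11-30,Male,50000
"""

# Parse HIRE_DATE as datetimes and use it as the index,
# which yields the DatetimeIndex that resample requires.
employee = pd.read_csv(io.StringIO(csv_text),
                       parse_dates=['HIRE_DATE'],
                       index_col='HIRE_DATE')

# Simple grouping: average salary per gender, rounded to the nearest hundred.
gender_mean = employee.groupby('GENDER')['BASE_SALARY'].mean().round(-2)

# Time-based grouping: 10-year buckets anchored at the start of the
# earliest hire year (1958-01-01, 1968-01-01, 1978-01-01 for this sample).
decade_mean = employee.resample('10YS')['BASE_SALARY'].mean().round(-2)

print(gender_mean)
print(decade_mean)
```

Each bucket is labeled by its left edge, so a hire date of 1975-07-04 falls in the bucket labeled 1968-01-01, which covers 1968 through the end of 1977.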