How to do it...

  1. Read in the diamonds dataset, and output the first five rows:
>>> import pandas as pd
>>> diamonds = pd.read_csv('data/diamonds.csv')
>>> diamonds.head()
  2. Before we begin the analysis, let's convert the cut, color, and clarity columns into ordered categorical variables:
>>> cut_cats = ['Fair', 'Good', 'Very Good', 'Premium', 'Ideal']
>>> color_cats = ['J', 'I', 'H', 'G', 'F', 'E', 'D']
>>> clarity_cats = ['I1', 'SI2', 'SI1', 'VS2',
                    'VS1', 'VVS2', 'VVS1', 'IF']
>>> diamonds['cut'] = pd.Categorical(diamonds['cut'],
                                     categories=cut_cats,
                                     ordered=True)
>>> diamonds['color'] = pd.Categorical(diamonds['color'],
                                       categories=color_cats,
                                       ordered=True)
>>> diamonds['clarity'] ...
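
To see what an ordered categorical buys you, the conversion above can be tried on a small stand-in frame. This is a minimal sketch, not part of the recipe: the `toy` DataFrame and its price values are made up for illustration and are not taken from diamonds.csv. Only the `cut_cats` ordering comes from the recipe.

```python
import pandas as pd

# Hypothetical toy rows standing in for diamonds.csv (values are invented)
toy = pd.DataFrame({
    'cut': ['Ideal', 'Fair', 'Premium', 'Good'],
    'price': [326, 337, 334, 327],
})

# Same category order as in the recipe, from worst to best
cut_cats = ['Fair', 'Good', 'Very Good', 'Premium', 'Ideal']
toy['cut'] = pd.Categorical(toy['cut'], categories=cut_cats, ordered=True)

# Sorting follows the declared category order, not alphabetical order
sorted_cuts = toy.sort_values('cut')['cut'].tolist()

# Ordered categoricals can be compared against a category label
at_least_premium = toy[toy['cut'] >= 'Premium']

# min() and max() also respect the category order
best_cut = toy['cut'].max()
```

Without `ordered=True`, the comparison and `max()` calls would raise an error, and an alphabetical sort of the raw strings would place 'Ideal' before 'Premium'.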
