Hands-On Predictive Analytics with Python
by Alvaro Fuentes
December 2018
Content level: Beginner to intermediate
330 pages
8h 32m
English
Packt Publishing

Univariate EDA for categorical features

For categorical features, EDA is simpler because each feature takes only a limited number of categories. The first thing we would like to know is how many observations fall into each category. It is almost always useful to also express these counts as a percentage or proportion of the total.
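As a minimal sketch of the counting step just described, the snippet below computes both raw counts and percentages for one categorical column. The data is a small hypothetical sample invented for illustration, not the book's dataset; `value_counts` and its `normalize` parameter are standard pandas.

```python
import pandas as pd

# Hypothetical toy values standing in for a categorical column
cut = pd.Series(
    ["Ideal", "Premium", "Ideal", "Good", "Ideal", "Premium", "Premium", "Ideal"],
    name="cut",
)

# Number of observations in each category, most frequent first
counts = cut.value_counts()

# Same tally expressed as a proportion of the total (values sum to 1)
proportions = cut.value_counts(normalize=True)

# Side-by-side summary with proportions shown as percentages
summary = pd.DataFrame({"count": counts, "percent": (proportions * 100).round(1)})
print(summary)
```

Putting counts and percentages in one table makes it easy to spot rare categories, which raw counts alone can hide when the dataset is large.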

Just as the histogram is the default visualization for a numerical feature, the bar plot is the default way to visualize the distribution of a categorical feature, and pandas makes it very easy to produce. Since we have only three categorical features, we won't create a function like the one we created for numerical features.

Let's take a look at the cut feature:

feature = categorical_features[0]
count = diamonds[feature].value_counts()
...
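The preview cuts the listing off after the counts are computed, so the rest of the author's code is not shown here. As a hedged illustration of the bar plot described above, and not the book's actual code, the sketch below builds a tiny hypothetical `diamonds` frame (toy values, with only the `cut` column named in the text), converts counts to percentages, and draws them with the pandas plotting API. The `Agg` backend, figure size, and axis labels are assumptions chosen for this example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are invented for illustration
diamonds = pd.DataFrame(
    {"cut": ["Ideal", "Premium", "Ideal", "Good", "Ideal", "Premium", "Premium", "Ideal"]}
)
# Toy list containing only the feature named in the text
categorical_features = ["cut"]

feature = categorical_features[0]
count = diamonds[feature].value_counts()

# Express each category as a percentage of all observations
percent = 100 * count / count.sum()

# Bar plot of the categorical distribution, one bar per category
fig, ax = plt.subplots(figsize=(6, 4))
percent.plot(kind="bar", ax=ax)
ax.set_xlabel(feature)
ax.set_ylabel("Percent of observations")
ax.set_title(f"Distribution of {feature}")
fig.tight_layout()
```

In an interactive session or notebook the backend line can be dropped, and `plt.show()` displays the figure.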

Publisher Resources

ISBN: 9781789138719