Learn Python by Building Data Science Applications

by Philipp Kats, David Katz
August 2019
Beginner
482 pages
12h 56m
English
Packt Publishing

Initial exploration

Before anything else, we need to take a look at the data itself, as well as its columns and rows. It's reasonable to start data exploration by understanding the following:

  1. What specific values look like. Use df.head(N), df.tail(N), or df.sample(N) to retrieve (and print) the first N, last N, or N random rows of the dataset. For head and tail, N defaults to 5; for sample, it defaults to 1 (one row). Alternatively, the sample method can take a frac argument, which returns a fraction of the records: for example, df.sample(frac=0.25) returns 25% of the initial dataset. Note that printing omits some columns in the middle if there are too many of them.
  2. The overall shape of the dataset—the number ...
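
The row-inspection calls from the first item can be tried on a small, made-up DataFrame. This is a minimal sketch: the column names and values are hypothetical and are not the book's dataset, and random_state is passed only so the fractional sample is reproducible.

```python
import pandas as pd

# Hypothetical toy dataset; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "id": range(1, 21),
    "value": [x * 1.5 for x in range(1, 21)],
})

first_rows = df.head()     # first 5 rows (the default N)
last_three = df.tail(3)    # last 3 rows
one_random = df.sample()   # a single random row (the default for sample)

# 25% of the rows; random_state makes the draw repeatable.
quarter = df.sample(frac=0.25, random_state=0)

print(first_rows)
print(last_three)
print(one_random)
print(len(quarter), "rows in the 25% sample")
```

If a wide dataset prints with its middle columns elided, pd.set_option("display.max_columns", None) should lift that limit for the current session.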

ISBN: 9781789535365