Learn Python by Building Data Science Applications

by Philipp Kats, David Katz
August 2019
Beginner
482 pages
12h 56m
English
Packt Publishing
Content preview from Learn Python by Building Data Science Applications

Defining the scope of work to be done

Before we dive into the process of data cleaning, which might be very time-consuming, it is always useful to define the scope of work—which columns and rows we actually need to clean. For this chapter, let's restrict the scope to the lowest level of the hierarchy—specific battles (level=100—pages for events with no children). We can use the equality operator to generate a Boolean mask, and then use this mask to filter the dataset:

>>> battles = data[data.level == 100]
>>> battles.shape
(147, 23)
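To make the mask-then-filter step concrete, here is a minimal sketch on a toy frame. The tiny `data` frame below, with its `name` values and `level` numbers, is invented for illustration and is not the chapter's dataset; only the `level == 100` comparison mirrors the text.

```python
import pandas as pd

# Hypothetical stand-in for the chapter's `data` frame; all values are invented.
data = pd.DataFrame({
    "name": ["Campaign A", "Operation B", "Battle C", "Battle D"],
    "level": [1, 50, 100, 100],
})

# The comparison returns a Boolean Series aligned with the rows of `data`.
mask = data["level"] == 100

# Indexing with the mask keeps only the rows where the mask is True.
battles = data[mask]

print(mask.tolist())   # [False, False, True, True]
print(battles.shape)   # (2, 2)
```

Naming the mask separately is optional; `data[data.level == 100]` does the same thing in one step, but a named mask can be reused or combined with other conditions using `&` and `|`.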

There are many columns in the dataset—enough for pandas to omit the middle part when printing. As we'll be mostly focused on time, geolocation, names, and casualties of each side, let's define those columns of ...

Publisher Resources

ISBN: 9781789535365