Dealing with problematic data

By now, you should be more confident with the basics of the process and ready to face more problematic datasets, since messy data is very common in practice. So let's see what happens when the CSV file contains a header, some missing values, and dates. To make the example realistic, imagine the situation of a travel agency:

  1. Based on the temperatures of three popular destinations, the agency records whether the user picks the first, second, or third destination:
    Date,Temperature_city_1,Temperature_city_2,Temperature_city_3,Which_destination
    20140910,80,32,40,1
    20140911,100,50,36,2
    20140912,102,55,46,1
    20140912,60,20,35,3
    20140914,60,,32,3
    20140914,,57,42,2
  1. In this case, ...
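To see concretely where the records in step 1 are problematic, here is a minimal sketch that reads them using only the standard library `csv` and `datetime` modules. That choice is an assumption made for illustration and is not necessarily the loader the chapter relies on. The helper `parse_row` and the variable names are hypothetical, and the data is inlined as a string so that the example runs without a file on disk.

```python
import csv
import io
from datetime import datetime

# The sample records from step 1, inlined so the sketch is self-contained.
csv_text = """Date,Temperature_city_1,Temperature_city_2,Temperature_city_3,Which_destination
20140910,80,32,40,1
20140911,100,50,36,2
20140912,102,55,46,1
20140912,60,20,35,3
20140914,60,,32,3
20140914,,57,42,2
"""


def parse_row(row):
    """Convert one raw record (a dict of strings) into typed values.

    Hypothetical helper: dates become datetime.date objects, temperatures
    become floats, and empty temperature fields become None.
    """
    parsed = {"Date": datetime.strptime(row["Date"], "%Y%m%d").date()}
    for name, value in row.items():
        if name.startswith("Temperature_"):
            # An empty field means the reading is missing; keep it as None
            # instead of guessing a value.
            parsed[name] = float(value) if value != "" else None
    parsed["Which_destination"] = int(row["Which_destination"])
    return parsed


# DictReader consumes the first line as the header, so every record is
# keyed by column name rather than by position.
records = [parse_row(row) for row in csv.DictReader(io.StringIO(csv_text))]

# Locate the missing readings as (record index, column name) pairs.
missing = [
    (index, name)
    for index, record in enumerate(records)
    for name, value in record.items()
    if value is None
]

print(records[0]["Date"])
print(missing)
```

Keeping missing readings as `None` rather than filling them immediately is a deliberate choice in this sketch: it makes the gaps explicit, and the decision of how to replace them (a constant, a column average, or dropping the record) can be made separately.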
