7
Identifying and Fixing Missing Values
I think I speak for many data analysts and scientists when I say that rarely is something so seemingly small and trivial of as much consequence as a missing value. We spend a good deal of our time worrying about missing values because they can have a dramatic, and surprising, effect on our analysis. This is most likely to happen when values are not missing at random, but their absence is correlated with a dependent variable. For example, if we are doing a longitudinal study of earnings, but individuals with lower education are more likely to skip the earnings question each year, there is a decent chance that this will bias our parameter estimate for education.
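To make the earnings example concrete, here is a minimal sketch using made-up numbers. The column names (`education`, `earnings`), the eight respondents, and the pattern of skipped answers are all assumptions chosen for illustration, not data from any real study. It shows only the simplest symptom of non-random missingness: the share of missing values differs by education level, and the average of the earnings we do observe drifts away from the true average.

```python
import numpy as np
import pandas as pd

# Hypothetical respondents: earnings rise with years of education.
df = pd.DataFrame({
    "education": [10, 10, 10, 10, 16, 16, 16, 16],
    "earnings": [30000.0, 32000.0, 28000.0, 30000.0,
                 60000.0, 62000.0, 58000.0, 60000.0],
})

# The mean we would get if everyone answered the earnings question.
true_mean = df["earnings"].mean()

# Assumption: three of the four low-education respondents skip the question.
observed = df.copy()
observed.loc[[0, 1, 2], "earnings"] = np.nan

# Share of missing earnings values within each education level.
missing_by_education = (
    observed["earnings"].isna().groupby(observed["education"]).mean()
)

# pandas skips NaN by default, so this averages only the answered rows.
observed_mean = observed["earnings"].mean()

print(missing_by_education)
print(f"true mean: {true_mean:.0f}, observed mean: {observed_mean:.0f}")
```

With these invented values, 75 percent of the low-education rows are missing and none of the high-education rows are, so the observed mean (54,000) overstates the true mean (45,000). Whether a regression coefficient is also biased depends on what drives the missingness, which is why checking missingness against other variables is a sensible first step.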
Of course, identifying missing values is not even ...