May 2022
Beginner to intermediate
330 pages
7h 19m
English
The ability to quickly assess the shortcomings of your data and correct them can be the difference between finishing on time and falling behind. In this chapter, we'll give you the tools to identify some of these problems, which are present in much of the data found in industry.
We'll first look at cases where there is too much data. This becomes an issue when features are so highly correlated with one another that they complicate a model without adding new information. You'll see how to find these correlations and then remove the offending features.
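To make the idea concrete, here is a minimal sketch of one way to spot and drop highly correlated features. It uses only the Python standard library. The column names, the sample values, the helper functions `pearson` and `drop_correlated`, and the 0.95 threshold are all hypothetical choices for illustration, not the chapter's own method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.95):
    """Return the column names to keep.

    A column is dropped when its absolute correlation with any
    already-kept column exceeds the threshold (an assumed cutoff).
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical data: height recorded twice in different units, plus weight.
data = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "weight_kg": [70.0, 52.0, 88.0, 61.0, 75.0],
}

kept = drop_correlated(data)
print(kept)
```

In this sketch, `height_in` is dropped because it carries almost exactly the same information as `height_cm`, while `weight_kg` is kept. Which column of a correlated pair to keep, and what threshold to use, depends on your data and model.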
After that, we'll look at ways to get rid of blank, empty, or Not a Number (NaN) data that muddy the waters. This problem ...