Chapter 7. Will the Bad Data Please Stand Up?
Among hikers and climbers, they say that “there is no such thing as bad weather, only inappropriate clothing.” And as anybody who has spent some time outdoors can attest, it is often precisely trips undertaken under more challenging circumstances that lead to the most noteworthy memories. But one has to be willing to put oneself out there.
In a similar spirit, I don’t think there is really such a thing as “bad data,” only inappropriate approaches. To be sure, there are datasets that require more work (because of missing data, background noise, poor encoding, inconvenient file formats, and so on), but they don’t pose fundamental challenges. Given sufficient effort, these problems can be overcome, and there are useful techniques for handling such situations (like tricks for staying warm during a late-November hike).
But basically, that’s remaining within familiar territory. To discover new vistas, one has to be willing to follow an unmarked trail and see where it leads. Or equivalently, when working with data, one has to dare to have an opinion about where the data is leading and then check whether one was right about it. Note that this takes courage: it is far safer to merely describe what one sees, but doing so misses a whole lot of the action.
Let’s evaluate some trail reports. Later, we’ll regroup and see what lessons we have learned.
Example 1: Defect Reduction in Manufacturing
A manufacturing company ...