June 2017
Beginner to intermediate
576 pages
15h 22m
English
It is important to understand how data is generated in your domain, both now and in the past. This means understanding how data is transformed from its original raw state into the form consumed by its ultimate user. Understanding the imperfections of the current process will give you a sense of which data can be consistently relied upon. Additionally, try to gather a history of which data changes and methodologies have been attempted in the past. This is important because the ways the data was used, and the goals behind those uses, might have been quite different. Knowing what has succeeded and what has failed in the past will prevent you from reinventing ...