7 SOURCING
One of the key ingredients of data science is, as the name implies, data. As we discussed in the previous chapter, the data used in any project can be the difference between a good outcome and a bad one. To be clear, a good outcome isn’t just proving a hypothesis. Failing to prove a hypothesis can be equally valuable, because it can prevent a poor decision based on a false premise.
Sourcing data isn’t just about finding some data; it is about ensuring that the data set contains the data items needed and has the right properties to support the analysis. In some instances, the source data might need to be manipulated to derive the data required. In Chapter 8 we have an example of this in data granularity, where daily data is used to calculate ...