Chapter 1. Introducing Data Observability
Once upon a time, there was a young data analyst named Alex who had a deep passion for data. Alex loved the way data could help businesses make informed decisions, drive growth, and achieve success. However, Alex was also aware of the dangers of misinterpreting data or not having enough visibility into the data.
Alex was working on a critical project with a data engineer named Sarah. Sarah was responsible for preparing the data and making sure it was ready for analysis. As they delved deeper into the project, Alex and Sarah realized that there were many variables at play, and the data they were working with was not as straightforward as they had initially thought.
One day, while Alex was iterating on his analysis to generate insights, he noticed that the day's results looked odd and were hard to reconcile with what he had seen up to that point. He went to Sarah to discuss the case, but Sarah needed more context: what his previous interpretation was, what his expectations were, and what he was asking her to check.
After four days of collaboration, pair review, and several brainstorming sessions, Sarah and Alex discovered that subtle changes in the distribution of half a dozen variables in the incoming data had shifted the generated insights several transformation steps later. Some variables had more missing values and were therefore dropped in the cleaning transformation; others had their average value increase greatly, and ...