19
Exploratory Data Analysis
In Chapter 17, we discussed the challenges of using machine learning without premium or embedded capacity. One of the key pitfalls we highlighted was blindly applying automated machine learning (AutoML) solutions to a dataset, which often results in inaccurate models. Avoiding this pitfall starts with a critical step: gaining a deep understanding of the inherent characteristics of the dataset.
To that end, this chapter introduces exploratory data analysis (EDA). Pioneered by John Tukey, this approach encourages statisticians to explore the data thoroughly and formulate hypotheses. By doing so, we can extract valuable information that ultimately enhances our understanding of the dataset ...