19

Exploratory Data Analysis

In Chapter 17, we discussed the challenges of using machine learning without Premium or Embedded capacity. One of the key pitfalls we highlighted was blindly applying automated machine learning (AutoML) solutions to a dataset, which often results in inaccurate models. To avoid this pitfall, a critical step is to gain a deep understanding of the inherent characteristics of the dataset.

To that end, this chapter introduces exploratory data analysis (EDA). This approach, pioneered by John Tukey, encourages statisticians to explore the data thoroughly and to formulate hypotheses from what they find. By doing so, we can extract valuable information that ultimately enhances our understanding of the dataset ...
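
As a small illustration of what exploring a dataset can look like in practice, the sketch below computes a five-number summary (minimum, first quartile, median, third quartile, maximum), a classic EDA tool associated with Tukey. It is not taken from the chapter: the function name `five_number_summary` and the sample values are hypothetical, and the quartile method used is only one of several common conventions.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    ordered = sorted(values)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    # method="inclusive" treats the data as the whole population rather
    # than a sample; other conventions can give slightly different quartiles.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical values, invented purely for illustration.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 30]
summary = five_number_summary(sample)
print(summary)
```

Even a summary this simple can prompt a hypothesis: here the maximum lies far from the third quartile, which suggests checking whether that value is an outlier before any model is trained.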

