Chapter 4. Exploratory Data Analysis

In commercial settings, Exploratory Data Analysis (EDA) is generally commissioned as part of a larger piece of work that is organized and executed as a feasibility assessment. The aim of this assessment, and thus the focus of what we can term an extended EDA, is to answer a broad set of questions about whether the data examined is fit for purpose and thus worthy of further investment.

Under this general remit, the data investigations are expected to cover several aspects of feasibility. These include the practical aspects of using the data in production, such as its timeliness, quality, complexity, and coverage, as well as its appropriateness for the intended hypothesis to be ...
