Chapter 4 Exploratory Data Analysis

Once the laborious task of data munging is complete, the next step in the machine learning process is to become intimately familiar with the data set through Exploratory Data Analysis (EDA). The R statistical environment offers many features that support this effort: numeric summaries, aggregations, distributions, densities, reviews of the levels of factor variables, general statistical methods, exploratory plots, expository plots, and much more. It is always a good idea to explore a data set with multiple ...
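As a minimal sketch of a few of the techniques listed above, the following base R session computes numeric summaries, reviews factor levels, aggregates by group, and estimates a density. The built-in `iris` data set stands in for a freshly munged data set; that choice, and the variable names, are assumptions made for illustration and are not taken from the chapter.

```r
# Assumption: the built-in iris data stands in for a cleaned data set.
data(iris)

# Structure and numeric summaries of every column
str(iris)
summary(iris)

# Review the levels of a factor variable and how often each occurs
species_levels <- levels(iris$Species)
species_counts <- table(iris$Species)

# Aggregation: mean of each numeric column within each species
means_by_species <- aggregate(. ~ Species, data = iris, FUN = mean)

# Distribution: quartiles and a kernel density estimate of one variable
sepal_quartiles <- quantile(iris$Sepal.Length, probs = c(0.25, 0.5, 0.75))
sepal_density <- density(iris$Sepal.Length)

# Exploratory plots, drawn only in an interactive session
if (interactive()) {
  hist(iris$Sepal.Length, main = "Sepal length", xlab = "cm")
  plot(sepal_density, main = "Density of sepal length")
  boxplot(Sepal.Length ~ Species, data = iris, ylab = "Sepal length (cm)")
}
```

Keeping each result in a named object makes it easy to inspect them one at a time at the console, for example by typing `means_by_species` to compare group means side by side.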

