5

Exploratory Data Analysis

The previous chapter covered the basic principles of plotting with ggplot2, including the use of various geometries and theme layers. Cleaning and massaging the raw data (covered in Chapter 2 and Chapter 3) and visualizing the data (covered in Chapter 4) both belong to the first stage of a typical data science project workflow: exploratory data analysis (EDA). This chapter covers EDA through a few case studies, in which we apply the coding techniques covered earlier in this book and analyze the data through the lens of EDA.

By the end of this chapter, you will know how to uncover the structures of data using numerical and graphical techniques, discover interesting ...

