Chapter 21

Ten (or So) Questions Answered by Exploratory Data Analysis (EDA)

In This Chapter

  • Understanding the most important questions answered by Exploratory Data Analysis (EDA)
  • Seeing how to use EDA to determine if a dataset conforms to your assumptions

This chapter covers ten key questions about a dataset that you can answer by using exploratory data analysis (EDA). These questions focus on the statistical properties of the data, the probability distribution the data follows, and the nature of the relationships among the variables in the data.

What Are the Key Properties of a Dataset?

Prior to performing any type of statistical analysis, understanding the nature of the data being analyzed is essential. You can use EDA to identify the properties of a dataset to determine the most appropriate statistical methods to apply to the data. You can investigate several types of properties with EDA techniques, including the following:

  • The center of the data
  • The spread among the members of the data
  • The skewness of the data
  • The probability distribution the data follows
  • The correlation among the elements in the dataset
  • Whether or not the parameters of the data are constant over time
  • The presence of outliers in the data
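
As a rough illustration of how several of these properties can be checked, here is a minimal Python sketch that uses only the standard library. The function names (summarize, pearson, half_means), the sample values, the moment-based skewness formula, and the 1.5 × IQR outlier rule are all assumptions chosen for this example, not methods prescribed by the chapter. Identifying the probability distribution is not covered here.

```python
import statistics


def summarize(values):
    """Return center, spread, skewness, and IQR-rule outliers for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    median = statistics.median(values)
    stdev = statistics.stdev(values)  # sample standard deviation

    # Moment-based skewness: third central moment over second moment ** 1.5.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0

    # Flag points more than 1.5 * IQR outside the quartiles (a common rule of thumb).
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in values if x < low or x > high]

    return {
        "mean": mean,
        "median": median,
        "stdev": stdev,
        "skewness": skewness,
        "outliers": outliers,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def half_means(values):
    """Means of the first and second halves, a crude check for a shift over time."""
    mid = len(values) // 2
    return statistics.mean(values[:mid]), statistics.mean(values[mid:])


# Hypothetical sample data, invented for illustration only.
sample = [1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7, 1.0, 9.5]
summary = summarize(sample)
first_half_mean, second_half_mean = half_means(sample)
correlation = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

In this sample, the single large value pulls the mean above the median and produces a positive skewness, and the IQR rule flags that same value as an outlier. A large gap between the two half means would hint that the parameters are not constant over time, although here the gap is driven by that one outlier.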

Chapter 5 introduces most of these notions. Chapter 16 talks ...
