Chapter 7: Data Exploration and Summary Statistics

Overview

Summarizing Continuous Variables

Descriptive Statistics

Histograms

Percentiles

Correlations

Summarizing Categorical Variables

Distinct Counts

Frequency

Top K

Cross Tabulations

Variable Transformation and Dimension Reduction

Variable Binning

Variable Imputation

Conclusion

Overview

Describing the columns of a table through tabular or visual output is typically the first step in a data analysis or statistical modeling process. In this chapter, you learn how to use CAS to explore and summarize data. Topics include summarizing continuous and categorical variables, data transformation, dimension reduction, and related visualizations using the Python Bokeh package.
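As a rough illustration of what summarizing a column means, the sketch below computes descriptive statistics for a continuous variable, and a distinct count and top-k frequencies for a categorical one. It uses only the Python standard library and does not call CAS. The sample rows, the column names (`body_type`, `price`), and the helper functions `summarize_continuous` and `summarize_categorical` are hypothetical and appear here only for illustration.

```python
from collections import Counter
import statistics

# Hypothetical sample rows; column names and values are illustrative only.
rows = [
    {"body_type": "sedan", "price": 36000.0},
    {"body_type": "suv", "price": 41000.0},
    {"body_type": "sedan", "price": 43000.0},
    {"body_type": "truck", "price": 28000.0},
    {"body_type": "suv", "price": 52000.0},
]


def summarize_continuous(values):
    """Return basic descriptive statistics for a numeric column."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
    }


def summarize_categorical(values, k=2):
    """Return the distinct count and the k most frequent levels."""
    freq = Counter(values)
    return {"distinct": len(freq), "top_k": freq.most_common(k)}


price_summary = summarize_continuous([r["price"] for r in rows])
body_type_summary = summarize_categorical([r["body_type"] for r in rows])

print(price_summary)
print(body_type_summary)
```

On a CAS server, the same kind of summary would be computed by the server on a loaded table rather than in local Python; this local version only shows the shape of the output to expect.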

Let’s ...
