Appendix: Data Summarization and Visualization

  1. Part 1 Summarization 1: Building Blocks of Data Analysis
  2. Part 2 Visualization: Graphs and Tables for Summarizing and Organizing Data
  3. Part 3 Summarization 2: Measures of Center, Variability, and Position
  4. Part 4 Summarization and Visualization of Bivariate Relationships

Here, we present a very brief review of methods for summarizing and visualizing data. For deeper coverage, please see Discovering Statistics, by Daniel Larose (second edition, W.H. Freeman, New York, 2013).

Part 1 Summarization 1: Building Blocks of Data Analysis

  • Descriptive statistics refers to methods for summarizing and organizing the information in a data set. Consider Table A.1, which we will use to illustrate some statistical concepts.

    Table A.1 Characteristics of 10 loan applicants

    Applicant  Marital Status  Mortgage  Income ($)  Income Rank  Year  Risk
    1          Single          y         38,000      2            2009  Good
    2          Married         y         32,000      7            2010  Good
    3          Other           n         25,000      9            2011  Good
    4          Other           n         36,000      3            2009  Good
    5          Other           y         33,000      4            2010  Good
    6          Other           n         24,000      10           2008  Bad
    7          Married         y         25,100      8            2010  Good
    8          Married         y         48,000      1            2007  Good
    9          Married         y         32,100      6            2009  Bad
    10         Married         y         32,200      5            2010  Good
  • The entities for which information is collected are called the elements. In Table A.1, the elements are the 10 applicants. Elements are also called cases or subjects.
  • A variable is a characteristic of an element, which takes on different values for different elements. The variables in ...
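To make the distinction between elements and variables concrete, here is a minimal Python sketch that transcribes Table A.1 as a list of records. The field names (marital_status, income_rank, and so on) and the helper names are illustrative assumptions, not the book's notation. The sketch also assumes the applicant number is an identifier rather than a variable.

```python
# Table A.1 transcribed by hand: each dictionary is one element (one applicant).
# Field names are hypothetical labels chosen for this sketch.
columns = ("applicant", "marital_status", "mortgage", "income",
           "income_rank", "year", "risk")
rows = [
    (1, "Single", "y", 38000, 2, 2009, "Good"),
    (2, "Married", "y", 32000, 7, 2010, "Good"),
    (3, "Other", "n", 25000, 9, 2011, "Good"),
    (4, "Other", "n", 36000, 3, 2009, "Good"),
    (5, "Other", "y", 33000, 4, 2010, "Good"),
    (6, "Other", "n", 24000, 10, 2008, "Bad"),
    (7, "Married", "y", 25100, 8, 2010, "Good"),
    (8, "Married", "y", 48000, 1, 2007, "Good"),
    (9, "Married", "y", 32100, 6, 2009, "Bad"),
    (10, "Married", "y", 32200, 5, 2010, "Good"),
]
applicants = [dict(zip(columns, row)) for row in rows]

# The elements are the rows of the table.
n_elements = len(applicants)

# Assumption: every column except the applicant identifier is a variable.
variables = [name for name in columns if name != "applicant"]

# A variable takes on different values for different elements,
# so collect the distinct values observed for each one.
distinct_values = {
    name: sorted({record[name] for record in applicants})
    for name in variables
}

print("elements:", n_elements)
for name in variables:
    print(name, "->", distinct_values[name])
```

Each record plays the role of one element, and each key plays the role of one variable. Every variable here shows at least two distinct values across the 10 applicants, which is what separates a variable from a constant.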
