Appendix: Data Summarization and Visualization

  1. Part 1 Summarization 1: Building Blocks of Data Analysis
  2. Part 2 Visualization: Graphs and Tables for Summarizing and Organizing Data
  3. Part 3 Summarization 2: Measures of Center, Variability, and Position
  4. Part 4 Summarization and Visualization of Bivariate Relationships

Here, we present a very brief review of methods for summarizing and visualizing data. For deeper coverage, please see Discovering Statistics, by Daniel Larose (second edition, W.H. Freeman, New York, 2013).

Part 1 Summarization 1: Building Blocks of Data Analysis

  • Descriptive statistics refers to methods for summarizing and organizing the information in a data set. Consider Table A.1, which we will use to illustrate some statistical concepts.

    Table A.1 Characteristics of 10 loan applicants

    Applicant  Marital Status  Mortgage  Income ($)  Income Rank  Year  Risk
    1          Single          y         38,000      2            2009  Good
    2          Married         y         32,000      7            2010  Good
    3          Other           n         25,000      9            2011  Good
    4          Other           n         36,000      3            2009  Good
    5          Other           y         33,000      4            2010  Good
    6          Other           n         24,000      10           2008  Bad
    7          Married         y         25,100      8            2010  Good
    8          Married         y         48,000      1            2007  Good
    9          Married         y         32,100      6            2009  Bad
    10         Married         y         32,200      5            2010  Good
  • The entities for which information is collected are called the elements. In Table A.1, the elements are the 10 applicants. Elements are also called cases or subjects.
  • A variable is a characteristic of an element, which takes on different values for different elements. The variables in ...
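The definitions above can be made concrete with a minimal Python sketch that stores Table A.1 as one record per element. The snake_case column names and the helper `distinct_values` are illustrative assumptions, not names from the source; only the data values come from the table.

```python
# Column names follow the headers of Table A.1; the snake_case spellings
# are illustrative choices, not names used in the source.
COLUMNS = ("applicant", "marital_status", "mortgage", "income",
           "income_rank", "year", "risk")

# One tuple per row of Table A.1 (incomes in dollars).
ROWS = [
    (1, "Single", "y", 38000, 2, 2009, "Good"),
    (2, "Married", "y", 32000, 7, 2010, "Good"),
    (3, "Other", "n", 25000, 9, 2011, "Good"),
    (4, "Other", "n", 36000, 3, 2009, "Good"),
    (5, "Other", "y", 33000, 4, 2010, "Good"),
    (6, "Other", "n", 24000, 10, 2008, "Bad"),
    (7, "Married", "y", 25100, 8, 2010, "Good"),
    (8, "Married", "y", 48000, 1, 2007, "Good"),
    (9, "Married", "y", 32100, 6, 2009, "Bad"),
    (10, "Married", "y", 32200, 5, 2010, "Good"),
]

# Each dict is one element (an applicant); each key is one column of the table.
applicants = [dict(zip(COLUMNS, row)) for row in ROWS]


def distinct_values(column):
    """Return the sorted distinct values a column takes across all elements."""
    return sorted({element[column] for element in applicants})


n_elements = len(applicants)
print(n_elements)                         # 10
print(distinct_values("marital_status"))  # ['Married', 'Other', 'Single']
print(distinct_values("risk"))            # ['Bad', 'Good']
```

Storing one record per applicant mirrors the idea that elements are the rows of the table, while each column records a characteristic whose value can differ from one element to the next.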
