Data Analysis with IBM SPSS Statistics

Book Description

Master data management & analysis techniques with IBM SPSS Statistics 24

About This Book

  • Leverage the power of IBM SPSS Statistics to perform efficient statistical analysis of your data
  • Choose the right statistical technique to analyze different types of data and build efficient models from your data with ease
  • Overcome any hurdle that you might come across while learning the different SPSS Statistics concepts with clear instructions, tips and tricks

Who This Book Is For

This book is designed for analysts and researchers who need to work with data to discover meaningful patterns but do not have the time (or inclination) to become programmers. We assume a foundational understanding of statistics such as one would learn in a basic course or two on statistical techniques and methods.

What You Will Learn

  • Install and set up SPSS to create a working environment for analytics
  • Explore data visually and statistically, assess data quality, and address issues related to missing data
  • Import different kinds of data and work with it
  • Organize data for analytical purposes (creating new data elements, sampling, weighting, subsetting, and restructuring your data)
  • Discover basic relationships among data elements (bivariate data patterns, differences in means, correlations)
  • Explore multivariate relationships
  • Use what SPSS Statistics offers to draw accurate insights from your research and inform your decision-making

In Detail

SPSS Statistics is a software package used for logical batched and non-batched statistical analysis. Analytical tools such as SPSS can readily provide even a novice user with an overwhelming amount of information and a broad range of options for analyzing patterns in the data.

The journey starts with installing and configuring SPSS Statistics for first use and exploring the data to understand its potential (as well as its limitations). Use the right statistical analysis technique, such as regression or classification, to analyze your data in the best possible manner. Work with graphs and charts to visualize your findings. With this information in hand, the discovery of patterns within the data can be undertaken. Finally, the high-level objective of developing predictive models that can be applied to other situations will be addressed.

By the end of this book, you will have a firm understanding of the various statistical analysis techniques offered by SPSS Statistics, and be able to master its use for data analysis with ease.

Style and approach

Provides a practical orientation to understanding a set of data and examining the key relationships among the data elements. Shows useful visualizations to enhance understanding and interpretation. Outlines a roadmap that focuses the process so that decisions regarding how to proceed can be made easily.

Table of Contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
    6. Customer support
      1. Downloading the example code
      2. Errata
      3. Piracy
      4. Questions
  2. Installing and Configuring SPSS
    1. The SPSS installation utility
      1. Installing Python for the scripting
    2. Licensing SPSS
      1. Confirming the options available
    3. Launching and using SPSS
    4. Setting parameters within the SPSS software
    5. Executing a basic SPSS session
    6. Summary
  3. Accessing and Organizing Data
    1. Accessing and organizing data overview
    2. Reading Excel files
    3. Reading delimited text data files
    4. Saving IBM SPSS Statistics files
    5. Reading IBM SPSS Statistics files
    6. Demo - first look at the data - frequencies
    7. Variable properties
      1. Variable properties - name
      2. Variable properties - type
      3. Variable properties - width
      4. Variable properties - decimals
      5. Variable properties - label
      6. Variable properties - values
      7. Variable properties - missing
      8. Variable properties - columns
      9. Variable properties - align
      10. Variable properties - measure
      11. Variable properties - role
      12. Demo - adding variable properties to the Variable View
      13. Demo - adding variable properties via syntax
      14. Demo - defining variable properties
    8. Summary
  4. Statistics for Individual Data Elements
    1. Getting the sample data
    2. Descriptive statistics for numeric fields
      1. Controlling the descriptives display order
      2. Frequency distributions
    3. Discovering coding issues using frequencies
      1. Using frequencies to verify missing data patterns
    4. Explore procedure
      1. Stem and leaf plot
      2. Boxplot
      3. Using explore to check subgroup patterns
    5. Summary
  5. Dealing with Missing Data and Outliers
    1. Outliers
      1. Frequencies for histogram and percentile values
      2. Descriptives for standardized scores
      3. The Examine procedure for extreme values and boxplot
      4. Detecting multivariate outliers
    2. Missing data
      1. Missing values in Frequencies
      2. Missing values in Descriptives
      3. Missing value patterns
      4. Replacing missing values
    3. Summary
  6. Visually Exploring the Data
    1. Graphs available in SPSS procedures
      1. Obtaining bar charts with frequencies
      2. Obtaining a histogram with frequencies
      3. Creating graphs using chart builder
      4. Building a scatterplot
      5. Create a boxplot using chart builder
    2. Summary
  7. Sampling, Subsetting, and Weighting
    1. Select cases dialog box
      1. Select cases - If condition is satisfied
        1. Example
        2. If condition is satisfied combined with Filter
        3. If condition is satisfied combined with Copy
        4. If condition is satisfied combined with Delete unselected cases
      2. The Temporary command
      3. Select cases based on time or case range
      4. Using the filter variable
    2. Selecting a random sample of cases
    3. Split File
    4. Weighting
    5. Summary
  8. Creating New Data Elements
    1. Transforming fields in SPSS
    2. The RECODE command
      1. Creating a dummy variable using RECODE
        1. Using RECODE to rescale a field
        2. Respondent's income using the midpoint of a selected category
    3. The COMPUTE command
    4. The IF command
    5. The DO IF/ELSE IF command
    6. General points regarding SPSS transformation commands
    7. Summary
  9. Adding and Matching Files
    1. SPSS Statistics commands to merge files
    2. Example of one-to-many merge - Northwind database
      1. Customer table
      2. Orders table
      3. The Customer-Orders relationship
      4. SPSS code for a one-to-many merge
      5. Alternate SPSS code
    3. One-to-one merge - two data subsets from GSS2016
    4. Example of combining cases using ADD FILES
    5. Summary
  10. Aggregating and Restructuring Data
    1. Using aggregation to add fields to a file
      1. Using aggregated variables to create new fields
    2. Aggregating up one level
      1. Preparing the data for aggregation
    3. Second level aggregation
      1. Preparing aggregated data for further use
    4. Matching the aggregated file back to find specific records
    5. Restructuring rows to columns
      1. Patient test data example
      2. Performing calculations following data restructuring
    6. Summary
  11. Crosstabulation Patterns for Categorical Data
    1. Percentages in crosstabs
      1. Testing differences in column proportions
        1. Crosstab pivot table editing
        2. Adding a layer variable
        3. Adding a second layer
      2. Using a Chi-square test with crosstabs
        1. Expected counts
        2. Context sensitive help
      3. Ordinal measures of association
      4. Interval with nominal association measure
      5. Nominal measures of association
    2. Summary
  12. Comparing Means and ANOVA
    1. SPSS procedures for comparing Means
      1. The Means procedure
        1. Adding a second variable
        2. Test of linearity example
        3. Testing the strength of the nonlinear relationship
      2. Single sample t-test
      3. The independent samples t-test
      4. Homogeneity of variance test
        1. Comparing subsets
      5. Paired t-test
        1. Paired t-test split by gender
      6. One-way analysis of variance
        1. Brown-Forsythe and Welch statistics
        2. Planned comparisons
    2. Post hoc comparisons
      1. The ANOVA procedure
    3. Summary
  13. Correlations
    1. Pearson correlations
      1. Testing for significance
      2. Mean differences versus correlations
    2. Listwise versus pairwise missing values
      1. Comparing pairwise and listwise correlation matrices
    3. Pivoting table editing to enhance correlation matrices
      1. Creating a very trimmed matrix
    4. Visualizing correlations with scatterplots
    5. Rank order correlations
    6. Partial correlations
      1. Adding a second control variable
    7. Summary
  14. Linear Regression
    1. Assumptions of the classical linear regression model
    2. Example - motor trend car data
      1. Exploring associations between the target and predictors
      2. Fitting and interpreting a simple regression model
      3. Residual analysis for the simple regression model
      4. Saving and interpreting casewise diagnostics
    3. Multiple regression - Model-building strategies
    4. Summary
  15. Principal Components and Factor Analysis
    1. Choosing between principal components analysis and factor analysis
    2. PCA example - violent crimes
      1. Simple descriptive analysis
      2. SPSS code - principal components analysis
      3. Assessing factorability of the data
      4. Principal components analysis of the crime variables
      5. Principal components analysis - two-component solution
    3. Factor analysis - abilities
      1. The reduced correlation matrix and its eigenvalues
      2. Factor analysis code
      3. Factor analysis results
    4. Summary
  16. Clustering
    1. Overview of cluster analysis
    2. Overview of SPSS Statistics cluster analysis procedures
    3. Hierarchical cluster analysis example
      1. Descriptive analysis
      2. Cluster analysis - first attempt
      3. Cluster analysis with four clusters
    4. K-means cluster analysis example
      1. Descriptive analysis
      2. K-means cluster analysis of the Old Faithful data
      3. Further cluster profiling
      4. Other analyses to try
    5. Twostep cluster analysis example
    6. Summary
  17. Discriminant Analysis
    1. Descriptive discriminant analysis
    2. Predictive discriminant analysis
    3. Assumptions underlying discriminant analysis
    4. Example data
    5. Statistical and graphical summary of the data
    6. Discriminant analysis setup - key decisions
      1. Priors
      2. Pooled or separate
      3. Dimensionality
      4. Syntax for the wine example
    7. Examining the results
    8. Scoring new observations
    9. Summary