Multiple Imputation of Missing Data Using SAS

Book description


Find guidance on using SAS for multiple imputation and solving common missing data issues.

Multiple Imputation of Missing Data Using SAS provides both theoretical background and constructive solutions for those working with incomplete data sets, in an engaging, example-driven format. It offers practical instruction on the use of SAS for multiple imputation and provides numerous examples that use a variety of public-release data sets, with applications to survey data.

Written for users with an intermediate background in SAS programming and statistics, this book is an excellent resource for anyone seeking guidance on multiple imputation. The authors cover the MI and MIANALYZE procedures in detail, along with other procedures used for analysis of complete data sets. They guide analysts through the multiple imputation process, including evaluation of missing data patterns, choice of an imputation method, execution of the process, and interpretation of results.

Topics discussed include how to deal with missing data problems in a statistically appropriate manner, how to intelligently select an imputation method, how to incorporate the uncertainty introduced by the imputation process, and how to incorporate the complex sample design (if appropriate) through use of the SAS SURVEY procedures.

Discover the theoretical background and see extensive applications of the multiple imputation process in action.

This book is part of the SAS Press program.

Table of contents

  1. About This Book
  2. About The Authors
  3. Acknowledgements
  4. Chapter 1: Introduction to Missing Data and Methods for Analyzing Data with Missing Values
    1. 1.1 Introduction
    2. 1.2 Sources and Patterns of Item Missing Data
    3. 1.3 Item Missing Data Mechanisms
    4. 1.4 Review of Strategies to Address Item Missing Data
      1. 1.4.1 Complete Case Analysis
      2. 1.4.2 Complete Case Analysis with Weighting Adjustments
      3. 1.4.3 Full Information Maximum Likelihood
      4. 1.4.4 Expectation-Maximization Algorithm
      5. 1.4.5 Single Imputation of Missing Values
      6. 1.4.6 Multiple Imputation
    5. 1.5 Outline of Book Chapters
    6. 1.6 Overview of Analysis Examples
  5. Chapter 2: Introduction to Multiple Imputation Theory and Methods
    1. 2.1 The Origins and Properties of Multiple Imputation Methods for Missing Data
      1. 2.1.1 A Short History of Imputation Methods
      2. 2.1.2 Why the Multiple Imputation Method?
      3. 2.1.3 Overview of Multiple Imputation Steps
    2. 2.2 Step 1–Defining the Imputation Model
      1. 2.2.1 Choosing the Variables to Include in the Imputation Model
      2. 2.2.2 Distributional Assumptions for the Imputation Model
    3. 2.3 Algorithms for the Multiple Imputation of Missing Values
      1. 2.3.1 General Theory for Multiple Imputation Algorithms
      2. 2.3.2 Methods for Monotone Missing Data Patterns
      3. 2.3.3 Methods for Arbitrary Missing Data Patterns
    4. 2.4 Step 2–Analysis of the MI Completed Data Sets
    5. 2.5 Step 3–Estimation and Inference for Multiply Imputed Data Sets
      1. 2.5.1 Multiple Imputation–Estimators and Variances for Descriptive Statistics and Model Parameters
      2. 2.5.2 Multiple Imputation–Confidence Intervals
    6. 2.6 MI Procedures for Multivariate Inference
      1. 2.6.1 Multiple Parameter Hypothesis Tests
      2. 2.6.2 Tests of Linear Hypotheses
    7. 2.7 How Many Multiple Imputation Repetitions Are Needed?
    8. 2.8 Summary
  6. Chapter 3: Preparation for Multiple Imputation
    1. 3.1 Planning the Imputation Session
    2. 3.2 Choosing the Variables to Include in a Multiple Imputation
    3. 3.3 Amount and Pattern of Missing Data
    4. 3.4 Types of Variables to Be Imputed
    5. 3.5 Imputation Methods
    6. 3.6 Number of Imputations (MI Repetitions)
    7. 3.7 Overview of Multiple Imputation Procedures
    8. 3.8 Multiple Imputation Example
    9. 3.9 Summary
  7. Chapter 4: Multiple Imputation for the Analysis of Complex Sample Survey Data
    1. 4.1 Multiple Imputation and Informative Data Collection Designs
    2. 4.2 Complex Sample Surveys
    3. 4.3 Incorporating the Complex Sample Design in the MI Imputation Step
    4. 4.4 Incorporating the Complex Sample Design in the MI Analysis and Inference Steps
    5. 4.5 MI Imputation and Analysis for Subpopulations of Complex Sample Design Data Sets
    6. 4.6 Summary
  8. Chapter 5: Multiple Imputation of Continuous Variables
    1. 5.1 Introduction to Multiple Imputation of Continuous Variables
    2. 5.2 Imputation of Continuous Variables with Arbitrary Missing Data
    3. 5.3 Imputation of Continuous Variables with Mixed Covariates and a Monotone Missing Data Pattern Using the Regression and Predictive Mean Matching Methods
      1. 5.3.1 Imputation of Continuous Variables with Mixed Covariates and a Monotone Missing Data Pattern Using the Regression Method
      2. 5.3.2 Imputation of Continuous Variables with Mixed Covariates and a Monotone Missing Data Pattern Using the Predictive Mean Matching Method
    4. 5.4 Imputation of Continuous Variables with an Arbitrary Missing Data Pattern and Mixed Covariates Using the FCS Method
      1. 5.4.1 Imputation of Continuous Variables with an Arbitrary Missing Data Pattern and Mixed Covariates Using the FCS Method
    5. 5.5 Summary
  9. Chapter 6: Multiple Imputation of Classification Variables
    1. 6.1 Introduction to Multiple Imputation of Classification Variables
    2. 6.2 Imputation of a Classification Variable with a Monotone Missing Data Pattern Using the Logistic Method
    3. 6.3 Imputation of Classification Variables with an Arbitrary Missing Data Pattern and Mixed Covariates Using the FCS Discriminant Function and the FCS Logistic Regression Method
    4. 6.4 Imputation of Classification Variables with an Arbitrary Missing Data Pattern and Mixed Covariates: A Comparison of the FCS and MCMC/Monotone Methods
      1. 6.4.1 Imputation of Classification Variables with Mixed Covariates and an Arbitrary Missing Data Pattern Using the FCS Method
      2. 6.4.2 Imputation of Classification Variables with Mixed Covariates and an Arbitrary Missing Data Pattern Using the MCMC/Monotone and Monotone Logistic Methods with a Multistep Approach
    5. 6.5 Summary
  10. Chapter 7: Multiple Imputation Case Studies
    1. 7.1 Multiple Imputation Case Studies
    2. 7.2 Comparative Analysis of HRS 2006 Data Using Complete Case Analysis and Multiple Imputation of Missing Data
      1. 7.2.1 Exploration of Missing Data
      2. 7.2.2 Complete Case Analysis Using PROC SURVEYLOGISTIC
      3. 7.2.3 Multiple Imputation of Missing Data with an Arbitrary Missing Data Pattern Using the FCS Method with Diagnostic Trace Plots
      4. 7.2.4 Logistic Regression Analysis of Imputed Data Sets Using PROC SURVEYLOGISTIC
      5. 7.2.5 Use of PROC MIANALYZE with Logistic Regression Output
      6. 7.2.6 Comparison of Complete Case Analysis and Multiply Imputed Analysis
    3. 7.3 Imputation and Analysis of Longitudinal Seizure Data
      1. 7.3.1 Introduction to the Seizure Data
      2. 7.3.2 Exploratory Analysis of Seizure Data
      3. 7.3.3 Conversion of Multiple-Record to Single-Record Data
      4. 7.3.4 Multiple Imputation of Missing Data
      5. 7.3.5 Conversion Back to Multiple Record Data for Analysis of Imputed Data Sets
      6. 7.3.6 Regression Analysis of Imputed Data Sets
    4. 7.4 Summary
  11. Chapter 8: Preparation of Data Sets for PROC MIANALYZE
    1. 8.1 Preparation of Data Sets for Use in PROC MIANALYZE
    2. 8.2 Imputation of Major League Baseball Players’ Salaries
      1. 8.3.1 PROC GLM Output Data Set for Use in PROC MIANALYZE
      2. 8.3.2 PROC MIXED Output Data Set for Use in PROC MIANALYZE
    3. 8.4 Imputation of NCS-R Data
    4. 8.5 PROC SURVEYPHREG Output Data Set for Use in PROC MIANALYZE
    5. 8.6 Summary
  12. References
  13. Index

Product information

  • Title: Multiple Imputation of Missing Data Using SAS
  • Author(s): Patricia Berglund, Steven G. Heeringa
  • Release date: July 2014
  • Publisher(s): SAS Institute
  • ISBN: 9781629592039