Behavioral Data Analysis with R and Python

Book description

Harness the full power of the behavioral data in your company by learning tools specifically designed for behavioral data analysis. Common data science algorithms and predictive analytics tools treat customer behavioral data, such as clicks on a website or purchases in a supermarket, the same as any other data. Instead, this practical guide introduces powerful methods specifically tailored for behavioral data analysis.

Advanced experimental design helps you get the most out of your A/B tests, while causal diagrams allow you to tease out the causes of behaviors even when you can't run experiments. Written in an accessible style for data scientists, business analysts, and behavioral scientists, this practical book provides complete examples and exercises in R and Python to help you gain more insight from your data immediately.

  • Understand the specifics of behavioral data
  • Explore the differences between measurement and prediction
  • Learn how to clean and prepare behavioral data
  • Design and analyze experiments to drive optimal business decisions
  • Use behavioral data to understand and measure cause and effect
  • Segment customers in a transparent and insightful way

Table of contents

  1. Preface
    1. Who This Book Is For
    2. Who This Book Is Not For
    3. R and Python Code
      1. Code Environments
      2. Code Conventions
      3. Functional-Style Programming 101
      4. Using Code Examples
    4. Navigating This Book
    5. Conventions Used in This Book
    6. O’Reilly Online Learning
    7. How to Contact Us
    8. Acknowledgments
  2. I. Understanding Behaviors
  3. 1. The Causal-Behavioral Framework for Data Analysis
    1. Why We Need Causal Analytics to Explain Human Behavior
      1. The Different Types of Analytics
      2. Human Beings Are Complicated
    2. Confound It! The Hidden Dangers of Letting Regression Sort It Out
      1. Data
      2. Why Correlation Is Not Causation: A Confounder in Action
      3. Too Many Variables Can Spoil the Broth
    3. Conclusion
  4. 2. Understanding Behavioral Data
    1. A Basic Model of Human Behavior
      1. Personal Characteristics
      2. Cognition and Emotions
      3. Intentions
      4. Actions
      5. Business Behaviors
    2. How to Connect Behaviors and Data
      1. Develop a Behavioral Integrity Mindset
      2. Distrust and Verify
      3. Identify the Category
      4. Refine Behavioral Variables
      5. Understand the Context
    3. Conclusion
  5. II. Causal Diagrams and Deconfounding
  6. 3. Introduction to Causal Diagrams
    1. Causal Diagrams and the Causal-Behavioral Framework
      1. Causal Diagrams Represent Behaviors
      2. Causal Diagrams Represent Data
    2. Fundamental Structures of Causal Diagrams
      1. Chains
      2. Forks
      3. Colliders
    3. Common Transformations of Causal Diagrams
      1. Slicing/Disaggregating Variables
      2. Aggregating Variables
      3. What About Cycles?
      4. Paths
    4. Conclusion
  7. 4. Building Causal Diagrams from Scratch
    1. Business Problem and Data Setup
      1. Data and Packages
      2. Understanding the Relationship of Interest
    2. Identify Candidate Variables to Include
      1. Actions
      2. Intentions
      3. Cognition and Emotions
      4. Personal Characteristics
      5. Business Behaviors
      6. Time Trends
    3. Validate Observable Variables to Include Based on Data
      1. Relationships Between Numeric Variables
      2. Relationships Between Categorical Variables
      3. Relationships Between Numeric and Categorical Variables
    4. Expand Causal Diagram Iteratively
      1. Identify Proxies for Unobserved Variables
      2. Identify Further Causes
      3. Iterate
    5. Simplify Causal Diagram
    6. Conclusion
  8. 5. Using Causal Diagrams to Deconfound Data Analyses
    1. Business Problem: Ice Cream and Bottled Water Sales
    2. The Disjunctive Cause Criterion
      1. Definition
      2. First Block
      3. Second Block
    3. The Backdoor Criterion
      1. Definitions
      2. First Block
      3. Second Block
    4. Conclusion
  9. III. Robust Data Analysis
  10. 6. Handling Missing Data
    1. Data and Packages
    2. Visualizing Missing Data
      1. Amount of Missing Data
      2. Correlation of Missingness
    3. Diagnosing Missing Data
      1. Causes of Missingness: Rubin’s Classification
      2. Diagnosing MCAR Variables
      3. Diagnosing MAR Variables
      4. Diagnosing MNAR Variables
      5. Missingness as a Spectrum
    4. Handling Missing Data
      1. Introduction to Multiple Imputation (MI)
      2. Default Imputation Method: Predictive Mean Matching
      3. From PMM to Normal Imputation (R Only)
      4. Adding Auxiliary Variables
      5. Scaling Up the Number of Imputed Data Sets
    5. Conclusion
  11. 7. Measuring Uncertainty with the Bootstrap
    1. Intro to the Bootstrap: “Polling” Oneself Up
      1. Packages
      2. The Business Problem: Small Data with an Outlier
      3. Bootstrap Confidence Interval for the Sample Mean
      4. Bootstrap Confidence Intervals for Ad Hoc Statistics
    2. The Bootstrap for Regression Analysis
    3. When to Use the Bootstrap
      1. Conditions for the Traditional Central Estimate to Be Sufficient
      2. Conditions for the Traditional CI to Be Sufficient
      3. Determining the Number of Bootstrap Samples
    4. Optimizing the Bootstrap in R and Python
      1. R: The BehavioralDataAnalysis Package
      2. Python Optimization
    5. Conclusion
  12. IV. Designing and Analyzing Experiments
  13. 8. Experimental Design: The Basics
    1. Planning the Experiment: Theory of Change
      1. Business Goal and Target Metric
      2. Intervention
      3. Behavioral Logic
    2. Data and Packages
    3. Determining Random Assignment and Sample Size/Power
      1. Random Assignment
      2. Sample Size and Power Analysis
    4. Analyzing and Interpreting Experimental Results
    5. Conclusion
  14. 9. Stratified Randomization
    1. Planning the Experiment
      1. Business Goal and Target Metric
      2. Definition of the Intervention
      3. Behavioral Logic
      4. Data and Packages
    2. Determining Random Assignment and Sample Size/Power
      1. Random Assignment
      2. Power Analysis with Bootstrap Simulations
    3. Analyzing and Interpreting Experimental Results
      1. Intention-to-Treat Estimate for Encouragement Intervention
      2. Complier Average Causal Estimate for Mandatory Intervention
    4. Conclusion
  15. 10. Cluster Randomization and Hierarchical Modeling
    1. Planning the Experiment
      1. Business Goal and Target Metric
      2. Definition of the Intervention
      3. Behavioral Logic
    2. Data and Packages
    3. Introduction to Hierarchical Modeling
      1. R Code
      2. Python Code
    4. Determining Random Assignment and Sample Size/Power
      1. Random Assignment
      2. Power Analysis
    5. Analyzing the Experiment
    6. Conclusion
  16. V. Advanced Tools in Behavioral Data Analysis
  17. 11. Introduction to Moderation
    1. Data and Packages
    2. Behavioral Varieties of Moderation
      1. Segmentation
      2. Interactions
      3. Nonlinearities
    3. How to Apply Moderation
      1. When to Look for Moderation?
      2. Multiple Moderators
      3. Validating Moderation with Bootstrap
      4. Interpreting Individual Coefficients
    4. Conclusion
  18. 12. Mediation and Instrumental Variables
    1. Mediation
      1. Understanding Causal Mechanisms
      2. Causal Biases
      3. Identifying Mediation
      4. Measuring Mediation
    2. Instrumental Variables
      1. Data
      2. Packages
      3. Understanding and Applying IVs
      4. Measurement
      5. Applying IVs: Frequently Asked Questions
    3. Conclusion
  19. Bibliography
  20. Index
  21. About the Author

Product information

  • Title: Behavioral Data Analysis with R and Python
  • Author(s): Florent Buisson
  • Release date: June 2021
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492061373