Fighting Churn with Data

Book description

The beating heart of any product or service business is returning clients. Don't let your hard-won customers vanish, taking their money with them. In Fighting Churn with Data you'll learn powerful data-driven techniques to maximize customer retention and minimize the actions that cause customers to stop engaging or unsubscribe altogether. This hands-on guide is packed with techniques for converting raw data into measurable metrics, testing hypotheses, and presenting findings in a form that non-technical decision makers can easily understand.

About the Technology
Keeping customers active and engaged is essential for any business that relies on recurring revenue and repeat sales. Customer turnover—or “churn”—is costly, frustrating, and preventable. By applying the techniques in this book, you can identify the warning signs of churn and learn to catch customers before they leave.

About the Book
Fighting Churn with Data teaches developers and data scientists proven techniques for stopping churn before it happens. Packed with real-world use cases and examples, this book teaches you to convert raw data into measurable behavior metrics, calculate customer lifetime value, and improve churn forecasting with demographic data. By following Zuora Chief Data Scientist Carl Gold’s methods, you’ll reap the benefits of high customer retention.

What's Inside
  • Calculating churn metrics
  • Identifying user behavior that predicts churn
  • Using churn reduction tactics with customer segmentation
  • Applying churn analysis techniques to other business areas
  • Using AI for accurate churn forecasting


About the Reader
For readers with basic data analysis skills, including Python and SQL.

About the Author
Carl Gold is a Senior Data Science Manager for financial startup Migo.money. He has previously worked as Chief Data Scientist for Zuora, the industry-leading subscription management platform. He has a Ph.D. from the California Institute of Technology.

Quotes
This book is a rarity. Lucid, compelling, and even funny. Mandatory reading for anyone running a subscription-based business. Buy a copy for your boss.
- From the Foreword by Tien Tzuo, Founder and CEO of Zuora, Inc.

A must-have weapon. . . . This comprehensive guide provides deep insights on churn analysis with step-by-step examples.
- Kelum Prabath Senanayake, Echoworx

A great exploration of churn, richly packed with theory and great code samples.
- George Thomas, Manhattan Associates

Churns out almost everything related to churn. Packed with lucid language, detailed explanations, and scrutiny of a real-life case study.
- Prabhuti Prakash, Synechron

Table of contents

  1. Fighting Churn with Data
  2. Copyright
  3. brief contents
  4. contents
  5. front matter
    1. foreword
    2. preface
    3. acknowledgments
    4. about this book
    5. Who should read this book
    6. How this book is organized: A road map
    7. About the code
    8. liveBook discussion forum
    9. Other online resources
    10. about the author
    11. about the cover illustration
  6. Part 1. Building your arsenal
  7. 1 The world of churn
    1. 1.1 Why you are reading this book
      1. 1.1.1 The typical churn scenario
      2. 1.1.2 What this book is about
    2. 1.2 Fighting churn
      1. 1.2.1 Interventions that reduce churn
      2. 1.2.2 Why churn is hard to fight
      3. 1.2.3 Great customer metrics: Weapons in the fight against churn
    3. 1.3 Why this book is different
      1. 1.3.1 Practical and in-depth
      2. 1.3.2 Simulated case study
    4. 1.4 Products with recurring user interactions
      1. 1.4.1 Paid consumer products
      2. 1.4.2 Business-to-business services
      3. 1.4.3 Ad-supported media and apps
      4. 1.4.4 Consumer feed subscriptions
      5. 1.4.5 Freemium business models
      6. 1.4.6 In-app purchase models
    5. 1.5 Nonsubscription churn scenarios
      1. 1.5.1 Inactivity as churn
      2. 1.5.2 Free trial conversion
      3. 1.5.3 Upsell/down sell
      4. 1.5.4 Other yes/no (binary) customer predictions
      5. 1.5.5 Customer activity predictions
      6. 1.5.6 Use cases that are not like churn
    6. 1.6 Customer behavior data
      1. 1.6.1 Customer events in common product categories
      2. 1.6.2 The most important events
    7. 1.7 Case studies in fighting churn
      1. 1.7.1 Klipfolio
      2. 1.7.2 Broadly
      3. 1.7.3 Versature
      4. 1.7.4 Social network simulation
    8. 1.8 Case studies in great customer metrics
      1. 1.8.1 Utilization
      2. 1.8.2 Success rates
      3. 1.8.3 Unit cost
    9. Summary
  8. 2 Measuring churn
    1. 2.1 Definition of the churn rate
      1. 2.1.1 Calculating the churn rate and retention rate
      2. 2.1.2 The relationship between churn rate and retention rate
    2. 2.2 Subscription databases
    3. 2.3 Basic churn calculation: Net retention
      1. 2.3.1 Net retention calculation
      2. 2.3.2 SQL net retention calculation
      3. 2.3.3 Interpreting net retention
    4. 2.4 Standard account-based churn
      1. 2.4.1 Standard churn rate definition
      2. 2.4.2 Outer joins for churn calculation
      3. 2.4.3 Standard churn calculation with SQL
      4. 2.4.4 When to use the standard churn rate
    5. 2.5 Activity (event-based) churn for nonsubscription products
      1. 2.5.1 Defining an active account and churn from events
      2. 2.5.2 Activity churn calculations with SQL
    6. 2.6 Advanced churn: Monthly recurring revenue (MRR) churn
      1. 2.6.1 MRR churn definition and calculation
      2. 2.6.2 MRR churn calculation with SQL
      3. 2.6.3 MRR churn vs. account churn vs. net (retention) churn
    7. 2.7 Churn rate measurement conversion
      1. 2.7.1 Survivor analysis (advanced)
      2. 2.7.2 Churn rate conversions
      3. 2.7.3 Converting any churn measurement window in SQL
      4. 2.7.4 Picking the churn measurement window
      5. 2.7.5 Seasonality and churn rates
    8. Summary
  9. 3 Measuring customers
    1. 3.1 From events to metrics
    2. 3.2 Event data warehouse schema
    3. 3.3 Counting events in one time period
    4. 3.4 Details of metric period definitions
      1. 3.4.1 Weekly behavioral cycles
      2. 3.4.2 Timestamps for metric measurements
    5. 3.5 Making measurements at different points in time
      1. 3.5.1 Overlapping measurement windows
      2. 3.5.2 Timing metric measurements
      3. 3.5.3 Saving metric measurements
      4. 3.5.4 Saving metrics for the simulation examples
    6. 3.6 Measuring totals and averages of event properties
    7. 3.7 Metric quality assurance
      1. 3.7.1 Testing how metrics change over time
      2. 3.7.2 Metric quality assurance (QA) case studies
      3. 3.7.3 Checking how many accounts receive metrics
    8. 3.8 Event QA
      1. 3.8.1 Checking how events change over time
      2. 3.8.2 Checking events per account
    9. 3.9 Selecting the measurement period for behavioral measurements
    10. 3.10 Measuring account tenure
      1. 3.10.1 Account tenure definition
      2. 3.10.2 Recursive table expressions for account tenure
      3. 3.10.3 Account tenure SQL program
    11. 3.11 Measuring MRR and other subscription metrics
      1. 3.11.1 Calculating MRR as a metric
      2. 3.11.2 Subscriptions for specific amounts
      3. 3.11.3 Calculating subscription unit quantities as metrics
      4. 3.11.4 Calculating the billing period as a metric
    12. Summary
  10. 4 Observing renewal and churn
    1. 4.1 Introduction to datasets
    2. 4.2 How to observe customers
      1. 4.2.1 Observation lead time
      2. 4.2.2 Observing sequences of renewals and a churn
      3. 4.2.3 Overview of creating a dataset from subscriptions
    3. 4.3 Identifying active periods from subscriptions
      1. 4.3.1 Active periods
      2. 4.3.2 Schema for storing active periods
      3. 4.3.3 Finding active periods that are ongoing
      4. 4.3.4 Finding active periods ending in churn
    4. 4.4 Identifying active periods for nonsubscription products
      1. 4.4.1 Active period definition
      2. 4.4.2 Process for forming datasets from events
      3. 4.4.3 SQL for calculating active weeks
    5. 4.5 Picking observation dates
      1. 4.5.1 Balancing churn and nonchurn observations
      2. 4.5.2 Observation date-picking algorithm
      3. 4.5.3 Observation date SQL program
    6. 4.6 Exporting a churn dataset
      1. 4.6.1 Dataset creation SQL program
    7. 4.7 Exporting the current customers for segmentation
      1. 4.7.1 Selecting active accounts and metrics
      2. 4.7.2 Segmenting customers by their metrics
    8. Summary
  11. Part 2. Waging the war
  12. 5 Understanding churn and behavior with metrics
    1. 5.1 Metric cohort analysis
      1. 5.1.1 The idea behind cohort analysis
      2. 5.1.2 Cohort analysis with Python
      3. 5.1.3 Cohorts of product use
      4. 5.1.4 Cohorts of account tenure
      5. 5.1.5 Cohort analysis of billing period
      6. 5.1.6 Minimum cohort size
      7. 5.1.7 Significant and insignificant cohort differences
      8. 5.1.8 Metric cohorts with a majority of zero customer metrics
      9. 5.1.9 Causality: Are the metrics causing churn?
    2. 5.2 Summarizing customer behavior
      1. 5.2.1 Understanding the distribution of the metrics
      2. 5.2.2 Calculating dataset summary statistics in Python
      3. 5.2.3 Screening rare metrics
      4. 5.2.4 Involving the business in data quality assurance
    3. 5.3 Scoring metrics
      1. 5.3.1 The idea behind metric scores
      2. 5.3.2 The metric score algorithm
      3. 5.3.3 Calculating metric scores in Python
      4. 5.3.4 Cohort analysis with scored metrics
      5. 5.3.5 Cohort analysis of monthly recurring revenue
    4. 5.4 Removing unwanted or invalid observations
      1. 5.4.1 Removing nonpaying customers from churn analysis
      2. 5.4.2 Removing observations based on metric thresholds in Python
      3. 5.4.3 Removing zero measurements from rare metric analyses
      4. 5.4.4 Disengaging behaviors: Metrics associated with increasing churn
    5. 5.5 Segmenting customers by using cohort analysis
      1. 5.5.1 Segmenting process
      2. 5.5.2 Choosing segment criteria
    6. Summary
  13. 6 Relationships between customer behaviors
    1. 6.1 Correlation between behaviors
      1. 6.1.1 Correlation between pairs of metrics
      2. 6.1.2 Investigating correlations with Python
      3. 6.1.3 Understanding correlations between sets of metrics with correlation matrices
      4. 6.1.4 Case study correlation matrices
      5. 6.1.5 Calculating correlation matrices in Python
    2. 6.2 Averaging groups of behavioral metrics
      1. 6.2.1 Why you average correlated metric scores
      2. 6.2.2 Averaging scores with a matrix of weights (loading matrix)
      3. 6.2.3 Case study for loading matrices
      4. 6.2.4 Applying a loading matrix in Python
      5. 6.2.5 Churn cohort analysis on metric group average scores
    3. 6.3 Discovering groups of correlated metrics
      1. 6.3.1 Grouping metrics by clustering correlations
      2. 6.3.2 Clustering correlations in Python
      3. 6.3.3 Loading matrix weights that make the average of scores a score
      4. 6.3.4 Running the metric grouping and grouped cohort analysis listings
      5. 6.3.5 Picking the correlation threshold for clustering
    4. 6.4 Explaining correlated metric groups to businesspeople
    5. Summary
  14. 7 Segmenting customers with advanced metrics
    1. 7.1 Ratio metrics
      1. 7.1.1 When to use ratio metrics and why
      2. 7.1.2 How to calculate ratio metrics
      3. 7.1.3 Ratio metric case study examples
      4. 7.1.4 Additional ratio metrics for the simulated social network
    2. 7.2 Percentage of total metrics
      1. 7.2.1 Calculating percentage of total metrics
      2. 7.2.2 Percentage of total metric case study with two metrics
      3. 7.2.3 Percentage of total metrics case study with multiple metrics
    3. 7.3 Metrics that measure change
      1. 7.3.1 Measuring change in the level of activity
      2. 7.3.2 Scores for metrics with extreme outliers (fat tails)
      3. 7.3.3 Measuring the time since the last activity
    4. 7.4 Scaling metric time periods
      1. 7.4.1 Scaling longer metrics to shorter quoting periods
      2. 7.4.2 Estimating metrics for new accounts
    5. 7.5 User metrics
      1. 7.5.1 Measuring active users
      2. 7.5.2 Active user metrics
    6. 7.6 Which ratios to use
      1. 7.6.1 Why use ratios, and what else is there?
      2. 7.6.2 Which ratios to use?
    7. Summary
  15. Part 3. Special weapons and tactics
  16. 8 Forecasting churn
    1. 8.1 Forecasting churn with a model
      1. 8.1.1 Probability forecasts with a model
      2. 8.1.2 Engagement and retention probability
      3. 8.1.3 Engagement and customer behavior
      4. 8.1.4 An offset matches observed churn rates to the S curve
      5. 8.1.5 The logistic regression probability calculation
    2. 8.2 Reviewing data preparation
    3. 8.3 Fitting a churn model
      1. 8.3.1 Results of logistic regression
      2. 8.3.2 Logistic regression code
      3. 8.3.3 Explaining logistic regression results
      4. 8.3.4 Logistic regression case study
      5. 8.3.5 Calibration and historical churn probabilities
    4. 8.4 Forecasting churn probabilities
      1. 8.4.1 Preparing the current customer dataset for forecasting
      2. 8.4.2 Preparing the current customer data for segmenting
      3. 8.4.3 Forecasting with a saved model
      4. 8.4.4 Forecasting case studies
      5. 8.4.5 Forecast calibration and forecast drift
    5. 8.5 Pitfalls of churn forecasting
      1. 8.5.1 Correlated metrics
      2. 8.5.2 Outliers
    6. 8.6 Customer lifetime value
      1. 8.6.1 The meaning(s) of CLV
      2. 8.6.2 From churn to expected customer lifetime
      3. 8.6.3 CLV formulas
    7. Summary
  17. 9 Forecast accuracy and machine learning
    1. 9.1 Measuring the accuracy of churn forecasts
      1. 9.1.1 Why you don’t use the standard accuracy measurement for churn
      2. 9.1.2 Measuring churn forecast accuracy with the AUC
      3. 9.1.3 Measuring churn forecast accuracy with the lift
    2. 9.2 Historical accuracy simulation: Backtesting
      1. 9.2.1 What and why of backtesting
      2. 9.2.2 Backtesting code
      3. 9.2.3 Backtesting considerations and pitfalls
    3. 9.3 The regression control parameter
      1. 9.3.1 Controlling the strength and number of regression weights
      2. 9.3.2 Regression with the control parameter
    4. 9.4 Picking the regression parameter by testing (cross-validation)
      1. 9.4.1 Cross-validation
      2. 9.4.2 Cross-validation code
      3. 9.4.3 Regression cross-validation case studies
    5. 9.5 Forecasting churn risk with machine learning
      1. 9.5.1 The XGBoost learning model
      2. 9.5.2 XGBoost cross-validation
      3. 9.5.3 Comparison of XGBoost accuracy to regression
      4. 9.5.4 Comparison of advanced and basic metrics
    6. 9.6 Segmenting customers with machine learning forecasts
    7. Summary
  18. 10 Churn demographics and firmographics
    1. 10.1 Demographic and firmographic datasets
      1. 10.1.1 Types of demographic and firmographic data
      2. 10.1.2 Account data model for the social network simulation
      3. 10.1.3 Demographic dataset SQL
    2. 10.2 Churn cohorts with demographic and firmographic categories
      1. 10.2.1 Churn rate cohorts for demographic categories
      2. 10.2.2 Churn rate confidence intervals
      3. 10.2.3 Comparing demographic cohorts with confidence intervals
    3. 10.3 Grouping demographic categories
      1. 10.3.1 Representing groups with a mapping dictionary
      2. 10.3.2 Cohort analysis with grouped categories
      3. 10.3.3 Designing category groups
    4. 10.4 Churn analysis for date- and numeric-based demographics
    5. 10.5 Churn forecasting with demographic data
      1. 10.5.1 Converting text fields to dummy variables
      2. 10.5.2 Forecasting churn with categorical dummy variables alone
      3. 10.5.3 Combining dummy variables with numeric data
      4. 10.5.4 Forecasting churn with demographic and metrics combined
    6. 10.6 Segmenting current customers with demographic data
    7. Summary
  19. 11 Leading the fight against churn
    1. 11.1 Planning your own fight against churn
      1. 11.1.1 Data processing and analysis checklist
      2. 11.1.2 Communication to the business checklist
    2. 11.2 Running the book listings on your own data
      1. 11.2.1 Loading your data into this book’s data schema
      2. 11.2.2 Running the listings on your own data
    3. 11.3 Porting this book’s listings to different environments
      1. 11.3.1 Porting the SQL listings
      2. 11.3.2 Porting the Python listings
    4. 11.4 Learning more and keeping in touch
      1. 11.4.1 Author’s blog site and social media
      2. 11.4.2 Sources for churn benchmark information
      3. 11.4.3 Other sources of information about churn
      4. 11.4.4 Products that help with churn
    5. Summary
  20. index

Product information

  • Title: Fighting Churn with Data
  • Author(s): Carl Gold
  • Release date: December 2020
  • Publisher(s): Manning Publications
  • ISBN: 9781617296529