Fighting Churn with Data

Author: Carl Gold
ISBN: 9781617296529

December 2020

Intermediate to advanced

504 pages

15h 39m

English

Manning Publications

Fighting Churn with Data
Copyright
brief contents
contents
front matter
foreword
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A road map
About the code
liveBook discussion forum
Other online resources
about the author
about the cover illustration
Part 1. Building your arsenal
1 The world of churn
1.1 Why you are reading this book
1.1.1 The typical churn scenario
1.1.2 What this book is about
1.2 Fighting churn
1.2.1 Interventions that reduce churn
1.2.2 Why churn is hard to fight
1.2.3 Great customer metrics: Weapons in the fight against churn
1.3 Why this book is different
1.3.1 Practical and in-depth
1.3.2 Simulated case study
1.4 Products with recurring user interactions
1.4.1 Paid consumer products
1.4.2 Business-to-business services
1.4.3 Ad-supported media and apps
1.4.4 Consumer feed subscriptions
1.4.5 Freemium business models
1.4.6 In-app purchase models
1.5 Nonsubscription churn scenarios
1.5.1 Inactivity as churn
1.5.2 Free trial conversion
1.5.3 Upsell/down sell
1.5.4 Other yes/no (binary) customer predictions
1.5.5 Customer activity predictions
1.5.6 Use cases that are not like churn
1.6 Customer behavior data
1.6.1 Customer events in common product categories
1.6.2 The most important events
1.7 Case studies in fighting churn
1.7.1 Klipfolio
1.7.2 Broadly
1.7.3 Versature
1.7.4 Social network simulation
1.8 Case studies in great customer metrics
1.8.1 Utilization
1.8.2 Success rates
1.8.3 Unit cost
Summary
2 Measuring churn
2.1 Definition of the churn rate
2.1.1 Calculating the churn rate and retention rate
2.1.2 The relationship between churn rate and retention rate
2.2 Subscription databases
2.3 Basic churn calculation: Net retention
2.3.1 Net retention calculation
2.3.2 SQL net retention calculation
2.3.3 Interpreting net retention
2.4 Standard account-based churn
2.4.1 Standard churn rate definition
2.4.2 Outer joins for churn calculation
2.4.3 Standard churn calculation with SQL
2.4.4 When to use the standard churn rate
2.5 Activity (event-based) churn for nonsubscription products
2.5.1 Defining an active account and churn from events
2.5.2 Activity churn calculations with SQL
2.6 Advanced churn: Monthly recurring revenue (MRR) churn
2.6.1 MRR churn definition and calculation
2.6.2 MRR churn calculation with SQL
2.6.3 MRR churn vs. account churn vs. net (retention) churn
2.7 Churn rate measurement conversion
2.7.1 Survivor analysis (advanced)
2.7.2 Churn rate conversions
2.7.3 Converting any churn measurement window in SQL
2.7.4 Picking the churn measurement window
2.7.5 Seasonality and churn rates
Summary
3 Measuring customers
3.1 From events to metrics
3.2 Event data warehouse schema
3.3 Counting events in one time period
3.4 Details of metric period definitions
3.4.1 Weekly behavioral cycles
3.4.2 Timestamps for metric measurements
3.5 Making measurements at different points in time
3.5.1 Overlapping measurement windows
3.5.2 Timing metric measurements
3.5.3 Saving metric measurements
3.5.4 Saving metrics for the simulation examples
3.6 Measuring totals and averages of event properties
3.7 Metric quality assurance
3.7.1 Testing how metrics change over time
3.7.2 Metric quality assurance (QA) case studies
3.7.3 Checking how many accounts receive metrics
3.8 Event QA
3.8.1 Checking how events change over time
3.8.2 Checking events per account
3.9 Selecting the measurement period for behavioral measurements
3.10 Measuring account tenure
3.10.1 Account tenure definition
3.10.2 Recursive table expressions for account tenure
3.10.3 Account tenure SQL program
3.11 Measuring MRR and other subscription metrics
3.11.1 Calculating MRR as a metric
3.11.2 Subscriptions for specific amounts
3.11.3 Calculating subscription unit quantities as metrics
3.11.4 Calculating the billing period as a metric
Summary
4 Observing renewal and churn
4.1 Introduction to datasets
4.2 How to observe customers
4.2.1 Observation lead time
4.2.2 Observing sequences of renewals and a churn
4.2.3 Overview of creating a dataset from subscriptions
4.3 Identifying active periods from subscriptions
4.3.1 Active periods
4.3.2 Schema for storing active periods
4.3.3 Finding active periods that are ongoing
4.3.4 Finding active periods ending in churn
4.4 Identifying active periods for nonsubscription products
4.4.1 Active period definition
4.4.2 Process for forming datasets from events
4.4.3 SQL for calculating active weeks
4.5 Picking observation dates
4.5.1 Balancing churn and nonchurn observations
4.5.2 Observation date-picking algorithm
4.5.3 Observation date SQL program
4.6 Exporting a churn dataset
4.6.1 Dataset creation SQL program
4.7 Exporting the current customers for segmentation
4.7.1 Selecting active accounts and metrics
4.7.2 Segmenting customers by their metrics
Summary

Part 2. Waging the war
5 Understanding churn and behavior with metrics
5.1 Metric cohort analysis
5.1.1 The idea behind cohort analysis
5.1.2 Cohort analysis with Python
5.1.3 Cohorts of product use
5.1.4 Cohorts of account tenure
5.1.5 Cohort analysis of billing period
5.1.6 Minimum cohort size
5.1.7 Significant and insignificant cohort differences
5.1.8 Metric cohorts with a majority of zero customer metrics
5.1.9 Causality: Are the metrics causing churn?
5.2 Summarizing customer behavior
5.2.1 Understanding the distribution of the metrics
5.2.2 Calculating dataset summary statistics in Python
5.2.3 Screening rare metrics
5.2.4 Involving the business in data quality assurance
5.3 Scoring metrics
5.3.1 The idea behind metric scores
5.3.2 The metric score algorithm
5.3.3 Calculating metric scores in Python
5.3.4 Cohort analysis with scored metrics
5.3.5 Cohort analysis of monthly recurring revenue
5.4 Removing unwanted or invalid observations
5.4.1 Removing nonpaying customers from churn analysis
5.4.2 Removing observations based on metric thresholds in Python
5.4.3 Removing zero measurements from rare metric analyses
5.4.4 Disengaging behaviors: Metrics associated with increasing churn
5.5 Segmenting customers by using cohort analysis
5.5.1 Segmenting process
5.5.2 Choosing segment criteria
Summary
6 Relationships between customer behaviors
6.1 Correlation between behaviors
6.1.1 Correlation between pairs of metrics
6.1.2 Investigating correlations with Python
6.1.3 Understanding correlations between sets of metrics with correlation matrices
6.1.4 Case study correlation matrices
6.1.5 Calculating correlation matrices in Python
6.2 Averaging groups of behavioral metrics
6.2.1 Why you average correlated metric scores
6.2.2 Averaging scores with a matrix of weights (loading matrix)
6.2.3 Case study for loading matrices
6.2.4 Applying a loading matrix in Python
6.2.5 Churn cohort analysis on metric group average scores
6.3 Discovering groups of correlated metrics
6.3.1 Grouping metrics by clustering correlations
6.3.2 Clustering correlations in Python
6.3.3 Loading matrix weights that make the average of scores a score
6.3.4 Running the metric grouping and grouped cohort analysis listings
6.3.5 Picking the correlation threshold for clustering
6.4 Explaining correlated metric groups to businesspeople
Summary
7 Segmenting customers with advanced metrics
7.1 Ratio metrics
7.1.1 When to use ratio metrics and why
7.1.2 How to calculate ratio metrics
7.1.3 Ratio metric case study examples
7.1.4 Additional ratio metrics for the simulated social network
7.2 Percentage of total metrics
7.2.1 Calculating percentage of total metrics
7.2.2 Percentage of total metric case study with two metrics
7.2.3 Percentage of total metrics case study with multiple metrics
7.3 Metrics that measure change
7.3.1 Measuring change in the level of activity
7.3.2 Scores for metrics with extreme outliers (fat tails)
7.3.3 Measuring the time since the last activity
7.4 Scaling metric time periods
7.4.1 Scaling longer metrics to shorter quoting periods
7.4.2 Estimating metrics for new accounts
7.5 User metrics
7.5.1 Measuring active users
7.5.2 Active user metrics
7.6 Which ratios to use
7.6.1 Why use ratios, and what else is there?
7.6.2 Which ratios to use?
Summary
Part 3. Special weapons and tactics
8 Forecasting churn
8.1 Forecasting churn with a model
8.1.1 Probability forecasts with a model
8.1.2 Engagement and retention probability
8.1.3 Engagement and customer behavior
8.1.4 An offset matches observed churn rates to the S curve
8.1.5 The logistic regression probability calculation
8.2 Reviewing data preparation
8.3 Fitting a churn model
8.3.1 Results of logistic regression
8.3.2 Logistic regression code
8.3.3 Explaining logistic regression results
8.3.4 Logistic regression case study
8.3.5 Calibration and historical churn probabilities
8.4 Forecasting churn probabilities
8.4.1 Preparing the current customer dataset for forecasting
8.4.2 Preparing the current customer data for segmenting
8.4.3 Forecasting with a saved model
8.4.4 Forecasting case studies
8.4.5 Forecast calibration and forecast drift
8.5 Pitfalls of churn forecasting
8.5.1 Correlated metrics
8.5.2 Outliers
8.6 Customer lifetime value
8.6.1 The meaning(s) of CLV
8.6.2 From churn to expected customer lifetime
8.6.3 CLV formulas
Summary
9 Forecast accuracy and machine learning
9.1 Measuring the accuracy of churn forecasts
9.1.1 Why you don’t use the standard accuracy measurement for churn
9.1.2 Measuring churn forecast accuracy with the AUC
9.1.3 Measuring churn forecast accuracy with the lift
9.2 Historical accuracy simulation: Backtesting
9.2.1 What and why of backtesting
9.2.2 Backtesting code
9.2.3 Backtesting considerations and pitfalls
9.3 The regression control parameter
9.3.1 Controlling the strength and number of regression weights
9.3.2 Regression with the control parameter
9.4 Picking the regression parameter by testing (cross-validation)
9.4.1 Cross-validation
9.4.2 Cross-validation code
9.4.3 Regression cross-validation case studies
9.5 Forecasting churn risk with machine learning
9.5.1 The XGBoost learning model
9.5.2 XGBoost cross-validation
9.5.3 Comparison of XGBoost accuracy to regression
9.5.4 Comparison of advanced and basic metrics
9.6 Segmenting customers with machine learning forecasts
Summary
10 Churn demographics and firmographics
10.1 Demographic and firmographic datasets
10.1.1 Types of demographic and firmographic data
10.1.2 Account data model for the social network simulation
10.1.3 Demographic dataset SQL
10.2 Churn cohorts with demographic and firmographic categories
10.2.1 Churn rate cohorts for demographic categories
10.2.2 Churn rate confidence intervals
10.2.3 Comparing demographic cohorts with confidence intervals
10.3 Grouping demographic categories
10.3.1 Representing groups with a mapping dictionary
10.3.2 Cohort analysis with grouped categories
10.3.3 Designing category groups
10.4 Churn analysis for date- and numeric-based demographics
10.5 Churn forecasting with demographic data
10.5.1 Converting text fields to dummy variables
10.5.2 Forecasting churn with categorical dummy variables alone
10.5.3 Combining dummy variables with numeric data
10.5.4 Forecasting churn with demographic and metrics combined
10.6 Segmenting current customers with demographic data
Summary
11 Leading the fight against churn
11.1 Planning your own fight against churn
11.1.1 Data processing and analysis checklist
11.1.2 Communication to the business checklist
11.2 Running the book listings on your own data
11.2.1 Loading your data into this book’s data schema
11.2.2 Running the listings on your own data
11.3 Porting this book’s listings to different environments
11.3.1 Porting the SQL listings
11.3.2 Porting the Python listings
11.4 Learning more and keeping in touch
11.4.1 Author’s blog site and social media
11.4.2 Sources for churn benchmark information
11.4.3 Other sources of information about churn
11.4.4 Products that help with churn
Summary
index

Overview

The beating heart of any product or service business is returning clients. Don't let your hard-won customers vanish, taking their money with them. In Fighting Churn with Data you'll learn powerful data-driven techniques to maximize customer retention and minimize actions that cause them to stop engaging or unsubscribe altogether. This hands-on guide is packed with techniques for converting raw data into measurable metrics, testing hypotheses, and presenting findings that are easily understandable to non-technical decision makers.

About the Technology
Keeping customers active and engaged is essential for any business that relies on recurring revenue and repeat sales. Customer turnover—or “churn”—is costly, frustrating, and preventable. By applying the techniques in this book, you can identify the warning signs of churn and learn to catch customers before they leave.

About the Book
Fighting Churn with Data teaches developers and data scientists proven techniques for stopping churn before it happens. Packed with real-world use cases and examples, this book teaches you to convert raw data into measurable behavior metrics, calculate customer lifetime value, and improve churn forecasting with demographic data. By following Zuora Chief Data Scientist Carl Gold’s methods, you’ll reap the benefits of high customer retention.

What's Inside

Calculating churn metrics
Identifying user behavior that predicts churn
Using churn reduction tactics with customer segmentation
Applying churn analysis techniques to other business areas
Using AI for accurate churn forecasting

About the Reader
For readers with basic data analysis skills, including Python and SQL.

About the Author
Carl Gold is a Senior Data Science Manager for financial startup Migo.money. He has previously worked as Chief Data Scientist for Zuora, the industry-leading subscription management platform. He has a Ph.D. from the California Institute of Technology.

Quotes
This book is a rarity. Lucid, compelling, and even funny. Mandatory reading for anyone running a subscription-based business. Buy a copy for your boss.
- From the Foreword by Tien Tzuo, Founder and CEO of Zuora, Inc.

A must-have weapon. . . . This comprehensive guide provides deep insights on churn analysis with step-by-step examples.
- Kelum Prabath Senanayake, Echoworx

A great exploration of churn, richly packed with theory and great code samples.
- George Thomas, Manhattan Associates

Churns out almost everything related to churn. Packed with lucid language, detailed explanations, and scrutiny of a real-life case study.
- Prabhuti Prakash, Synechron
