Causal inference 101: Answering the crucial "why" in your analysis

by Subhasish Misra

Released February 2020

Publisher(s): O'Reilly Media, Inc.

ISBN: 0636920371984

Video description

Causal questions are ubiquitous in data science. For example, you may have questions that are deeply rooted in causality, such as whether changing a feature on a website led to more traffic or whether digital ad exposure led to incremental purchases.

Randomized tests are considered the gold standard for measuring causal effects; however, in many cases experiments are infeasible or unethical. In such cases, you have to rely on observational (nonexperimental) data to derive causal insights. The crucial difference between randomized experiments and observational data is that in the former, test subjects (e.g., customers) are randomly assigned a treatment (e.g., digital advertisement exposure). This helps curb the possibility that user response (e.g., clicking on a link in the ad and purchasing the product) differs between the treated and nontreated groups because of preexisting differences in user characteristics (e.g., demographics, geolocation). In essence, you can then attribute divergences observed posttreatment in key outcomes (e.g., purchase rate) to the causal impact of the treatment. But this treatment assignment mechanism is absent when using observational data.
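To make the contrast concrete, here is a minimal simulation sketch. It is not taken from the session: the confounder (whether a user is already a frequent shopper), every probability, and the function name are hypothetical values chosen for illustration. It compares the naive difference in purchase rates when ad exposure is assigned by coin flip with the same difference when frequent shoppers are more likely to be exposed.

```python
import random

TRUE_EFFECT = 0.05  # assumed lift in purchase probability caused by the ad


def naive_difference(n, randomized, seed=0):
    """Purchase rate of treated users minus purchase rate of untreated users."""
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        # Hypothetical preexisting characteristic that also drives purchases.
        frequent = rng.random() < 0.5
        if randomized:
            # Experiment: exposure is a coin flip, unrelated to the user.
            treated = rng.random() < 0.5
        else:
            # Observational data: frequent shoppers see the ad more often.
            treated = rng.random() < (0.8 if frequent else 0.2)
        # Purchase probability = baseline + confounder effect + ad effect.
        p = 0.10 + (0.20 if frequent else 0.0) + (TRUE_EFFECT if treated else 0.0)
        purchased = rng.random() < p
        (treated_outcomes if treated else control_outcomes).append(purchased)
    treated_rate = sum(treated_outcomes) / len(treated_outcomes)
    control_rate = sum(control_outcomes) / len(control_outcomes)
    return treated_rate - control_rate


rct = naive_difference(200_000, randomized=True)
obs = naive_difference(200_000, randomized=False)
print(f"randomized estimate:    {rct:.3f}")
print(f"observational estimate: {obs:.3f}")
```

Under these assumed numbers, the randomized comparison lands close to the true lift of 0.05, while the observational comparison is inflated (about 0.17 in expectation) because frequent shoppers are overrepresented among the exposed users.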

Subhasish Misra (Walmart Labs) explores the statistical methods available to ensure you’re able to circumvent this shortcoming and get to causality. You’ll get a practical overview of the aspects of causal inference, including the fundamental tenets of causality and measuring causal effects; the challenges involved in measuring causal effects in real-world situations; distinguishing between randomized and observational measurement approaches; an introduction to measuring causal effects with observational data using matching and its extension of propensity score-based matching, with a focus on the intuition and statistics behind it; tips from the trenches based on Subhasish’s experience with these techniques; and practical limitations of these approaches. Subhasish walks you through an example of how matching was applied to get causal insights regarding the effectiveness of a digital product at Walmart.
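For readers who want a feel for propensity score-based matching before watching, the following is a rough sketch on simulated data. It is not the speaker's implementation or the Walmart analysis: the single covariate, the data-generating numbers, the hand-rolled logistic regression, and all function names are assumptions made for illustration. It estimates each unit's probability of treatment, pairs every treated unit with the control unit whose estimated probability is closest (with replacement), and averages the outcome differences.

```python
import bisect
import math
import random

TRUE_EFFECT = 1.0  # known only because the data below is simulated


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, seed=0):
    """Rows of (covariate x, treatment t, outcome y); x drives both t and y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        t = 1 if rng.random() < sigmoid(-2.0 + 4.0 * x) else 0
        y = 1.0 + 2.0 * x + TRUE_EFFECT * t + rng.gauss(0.0, 0.5)
        rows.append((x, t, y))
    return rows


def fit_propensity(rows, steps=300, lr=1.0):
    """Logistic regression of t on x, fitted by plain gradient descent."""
    b0 = b1 = 0.0
    n = len(rows)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, t, _y in rows:
            err = sigmoid(b0 + b1 * x) - t
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def naive_difference(rows):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for _x, t, y in rows if t == 1]
    control = [y for _x, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def matched_effect(rows, b0, b1):
    """Average treated-minus-matched-control outcome (effect on the treated)."""
    scored = [(sigmoid(b0 + b1 * x), t, y) for x, t, y in rows]
    controls = sorted((s, y) for s, t, y in scored if t == 0)
    control_scores = [s for s, _y in controls]
    diffs = []
    for s, t, y in scored:
        if t != 1:
            continue
        # Nearest control by propensity score, found among the two neighbours
        # of the insertion point in the sorted control scores.
        i = bisect.bisect_left(control_scores, s)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(controls)]
        j = min(candidates, key=lambda k: abs(control_scores[k] - s))
        diffs.append(y - controls[j][1])
    return sum(diffs) / len(diffs)


rows = make_data(2000)
b0, b1 = fit_propensity(rows)
naive = naive_difference(rows)
matched = matched_effect(rows, b0, b1)
print(f"naive difference: {naive:.2f}")
print(f"matched estimate: {matched:.2f}")
```

In this toy setup the naive difference overstates the effect because treated units tend to have higher values of the covariate, while the matched estimate should sit much closer to the simulated effect of 1.0. The sketch only adjusts for the one covariate it observes; matching cannot correct for characteristics that are missing from the data.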

Prerequisite knowledge

A basic understanding of statistics and data science

What you'll learn

Discover the fundamental nuances of causal inference and analytical frameworks and implementation tools to tease out causal effects in the wild—when randomization isn't an option
Understand the differences between randomized and observational studies and the challenges in getting to causal conclusions for each

This session is from the 2019 O'Reilly Strata Conference in New York, NY.