Causal Inference and Discovery in Python

Name: Causal Inference and Discovery in Python
Author: Aleksander Molak
ISBN: 9781804612989

by Aleksander Molak

May 2023

Intermediate to advanced

466 pages

13h 2m

English

Packt Publishing

Causal Inference and Discovery in Python – Machine Learning and Pearlian Perspective
Foreword
Contributors
About the author
About the reviewers
Acknowledgments
Preface
Who this book is for; What this book covers; To get the most out of this book; Download the example code files; Conventions used; Get in touch; Share Your Thoughts; Join our book's Discord space
Part 1: Causality – an Introduction
Chapter 1: Causality – Hey, We Have Machine Learning, So Why Even Bother?
Getting the most out of this book – get to know your free benefits; Interactive AI assistant (beta); DRM-free PDF or ePub version; A brief history of causality; Why causality? Ask babies!; Interacting with the world; Confounding – relationships that are not real; How not to lose money… and human lives; A marketer’s dilemma; Let’s play doctor!; Associations in the wild; Wrapping it up; References; Join our book's Discord space
Chapter 2: Judea Pearl and the Ladder of Causation
From associations to logic and imagination – the Ladder of Causation; Associations; Let’s practice!; What are interventions?; Changing the world; Correlation and causation; What are counterfactuals?; Let’s get weird (but formal)!; The fundamental problem of causal inference; Computing counterfactuals; Time to code!; Extra – is all machine learning causally the same?; Causality and reinforcement learning; Causality and semi-supervised and unsupervised learning; Wrapping it up; References
Chapter 3: Regression, Observations, and Interventions
Starting simple – observational data and linear regression; Linear regression; p-values and statistical significance; Geometric interpretation of linear regression; Reversing the order; Should we always control for all available covariates?; Navigating the maze; If you don’t know where you’re going, you might end up somewhere else; Get involved!; To control or not to control?; Regression and structural models; SCMs; Linear regression versus SCMs; Finding the link; Regression and causal effects; Wrapping it up; References

Chapter 4: Graphical Models
Graphs, graphs, graphs; Types of graphs; Graph representations; Graphs in Python; What is a graphical model?; DAG your pardon? Directed acyclic graphs in the causal wonderland; Definitions of causality; DAGs and causality; Let’s get formal!; Limitations of DAGs; Sources of causal graphs in the real world; Causal discovery; Expert knowledge; Combining causal discovery and expert knowledge; Extra – is there causality beyond DAGs?; Dynamical systems; Cyclic SCMs; Wrapping it up; References
Chapter 5: Forks, Chains, and Immoralities
Graphs and distributions and how to map between them; How to talk about independence; Choosing the right direction; Conditions and assumptions; Chains, forks, and colliders or…immoralities; A chain of events; Chains; Forks; Colliders, immoralities, or v-structures; Ambiguous cases; Forks, chains, colliders, and regression; Generating the chain dataset; Generating the fork dataset; Generating the collider dataset; Fitting the regression models; Wrapping it up; References; Join our book's Discord space
Part 2: Causal Inference
Chapter 6: Nodes, Edges, and Statistical (In)dependence
You’re gonna keep ‘em d-separated; Practice makes perfect – d-separation; Estimand first!; We live in a world of estimators; So, what is an estimand?; The back-door criterion; What is the back-door criterion?; Back-door and equivalent estimands; The front-door criterion; Can GPS lead us astray?; London cabbies and the magic pebble; Opening the front door; Three simple steps toward the front door; Front-door in practice; Are there other criteria out there? Let’s do-calculus!; The three rules of do-calculus; Instrumental variables; Wrapping it up; Answer; References
Chapter 7: The Four-Step Process of Causal Inference
Introduction to DoWhy and EconML; Python causal ecosystem; Why DoWhy?; Oui, mon ami, but what is DoWhy?; How about EconML?; Step 1 – modeling the problem; Creating the graph; Building a CausalModel object; Step 2 – identifying the estimand(s); Step 3 – obtaining estimates; Step 4 – where’s my validation set? Refutation tests; How to validate causal models; Introduction to refutation tests; Full example; Step 1 – encode the assumptions; Step 2 – getting the estimand; Step 3 – estimate!; Step 4 – refute them!; Wrapping it up; References; Join our book's Discord space
Chapter 8: Causal Models – Assumptions and Challenges
I am the king of the world! But am I?; In between; Identifiability; Lack of causal graphs; Not enough data; Unverifiable assumptions; An elephant in the room – hopeful or hopeless?; Let’s eat the elephant; Positivity; Exchangeability; Exchangeable subjects; Exchangeability versus confounding; …and more; Modularity; SUTVA; Consistency; Call me names – spurious relationships in the wild; Names, names, names; Should I ask you or someone who’s not here?; DAG them!; More selection bias; Wrapping it up; References
Chapter 9: Causal Inference and Machine Learning – from Matching to Meta-Learners
The basics I – matching; Types of matching; Treatment effects – ATE versus ATT/ATC; Matching estimators; Implementing matching; The basics II – propensity scores; Matching in the wild; Reducing the dimensionality with propensity scores; Propensity score matching (PSM); Inverse probability weighting (IPW); Many faces of propensity scores; Formalizing IPW; Implementing IPW; IPW – practical considerations; S-Learner – the Lone Ranger; The devil’s in the detail; Mom, Dad, meet CATE; Jokes aside, say hi to the heterogeneous crowd; Waving the assumptions flag; You’re the only one – modeling with S-Learner; Small data; S-Learner’s vulnerabilities; T-Learner – together we can do more; Forcing the split on treatment; T-Learner in four steps and a formula; Implementing T-Learner; X-Learner – a step further; Squeezing the lemon; Reconstructing the X-Learner; X-Learner – an alternative formulation; Implementing X-Learner; Wrapping it up; References
Chapter 10: Causal Inference and Machine Learning – Advanced Estimators, Experiments, Evaluations, and More
Doubly robust methods – let’s get more!; Do we need another thing?; Doubly robust is not equal to bulletproof…; …but it can bring a lot of value; The secret doubly robust sauce; Doubly robust estimator versus assumptions; DR-Learner – crossing the chasm; DR-Learners – more options; Targeted maximum likelihood estimator; If machine learning is cool, how about double machine learning?; Why DML and what’s so double about it?; DML with DoWhy and EconML; Hyperparameter tuning with DoWhy and EconML; Is DML a golden bullet?; Doubly robust versus DML; What’s in it for me?; Causal Forests and more; Causal trees; Forests overflow; Advantages of Causal Forests; Causal Forest with DoWhy and EconML; Heterogeneous treatment effects with experimental data – the uplift odyssey; The data; Choosing the framework; We don’t know half of the story; Kevin’s challenge; Opening the toolbox; Uplift models and performance; Other metrics for continuous outcomes with multiple treatments; Confidence intervals; Kevin’s challenge’s winning submission; When should we use CATE estimators for experimental data?; Model selection – a simplified guide; Extra – counterfactual explanations; Bad faith or tech that does not know?; Wrapping it up; References
Chapter 11: Causal Inference and Machine Learning – Deep Learning, NLP, and Beyond
Going deeper – deep learning for heterogeneous treatment effects; CATE goes deeper; SNet; Transformers and causal inference; The theory of meaning in five paragraphs; Making computers understand language; From philosophy to Python code; LLMs and causality; The three scenarios; CausalBert; Causality and time series – when an econometrician goes Bayesian; Quasi-experiments; Twitter acquisition and our googling patterns; The logic of synthetic controls; A visual introduction to the logic of synthetic controls; Starting with the data; Synthetic controls in code; Challenges; Wrapping it up; References
Part 3: Causal Discovery
Chapter 12: Can I Have a Causal Graph, Please?
Sources of causal knowledge; You and I, oversaturated; The power of a surprise; Scientific insights; The logic of science; Hypotheses are a species; One logic, many ways; Controlled experiments; Randomized controlled trials (RCTs); From experiments to graphs; Simulations; Personal experience and domain knowledge; Personal experiences; Domain knowledge; Causal structure learning; Wrapping it up; References; Join our book's Discord space
Chapter 13: Causal Discovery and Machine Learning – from Assumptions to Applications
Causal discovery – assumptions refresher; Gearing up; Always trying to be faithful…; …but it’s difficult sometimes; Minimalism is a virtue; The four (and a half) families; The four streams; Introduction to gCastle; Hello, gCastle!; Synthetic data in gCastle; Fitting your first causal discovery model; Visualizing the model; Model evaluation metrics; Constraint-based causal discovery; Constraints and independence; Leveraging the independence structure to recover the graph; PC algorithm – hidden challenges; PC algorithm for categorical data; Score-based causal discovery; Tabula rasa – starting fresh; GES – scoring; GES in gCastle; Functional causal discovery; The blessings of asymmetry; ANM model; Assessing independence; LiNGAM time; Gradient-based causal discovery; What exactly is so gradient about you?; Shed no tears; GOLEMs don’t cry; The comparison; Encoding expert knowledge; What is expert knowledge?; Expert knowledge in gCastle; Wrapping it up; References
Chapter 14: Causal Discovery and Machine Learning – Advanced Deep Learning and Beyond
Advanced causal discovery with deep learning; From generative models to causality; Looking back to learn who you are; DECI’s internal building blocks; DECI in code; DECI is end-to-end; Causal discovery under hidden confounding; The FCI algorithm; Other approaches to confounded data; Extra – going beyond observations; ENCO; ABCI; Causal discovery – real-world applications, challenges, and open problems; Wrapping it up!; References
Chapter 15: Epilogue
What we’ve learned in this book; Five steps to get the best out of your causal project; Starting with a question; Obtaining expert knowledge; Generating hypothetical graph(s); Check identifiability; Falsifying hypotheses; Causality and business; How causal doers go from vision to implementation; Toward the future of causal ML; Where are we now and where are we heading?; Causal benchmarks; Causal data fusion; Intervening agents; Causal structure learning; Imitation learning; Learning causality; Let’s stay in touch; Wrapping it up; References; Join our book's Discord space
Chapter 16: Unlock Your Book’s Exclusive Benefits
How to unlock these benefits in three easy steps; Step 1
Index
Why subscribe?
Join our book's Discord space; Other Books You May Enjoy; Packt is searching for authors like you; Share Your Thoughts

Overview

Causal Inference and Discovery in Python is a comprehensive guide to understanding and applying causal inference concepts and techniques. With practical examples and Python implementations, this book enables you to master causal modeling and leverage data for advanced decision-making in machine learning and data analysis.

What this book will help me do

Build a solid foundation in causal inference concepts, including structural causal models and their practical applications.
Learn to implement causal estimation techniques for assessing treatment effects using Python.
Understand and apply the four-step causal inference process through practical exercises and examples.
Explore advanced methodologies like uplift modeling and deep learning applications in causal inference.
Gain insights into causal discovery techniques and the future of causal artificial intelligence.

Author(s)

Aleksander Molak, the author of Causal Inference and Discovery in Python, is an experienced researcher and practitioner in the fields of machine learning and data science. He specializes in causal inference and machine learning applications, and in this book, he combines theoretical insights with practical Python tutorials to create an engaging and informative learning journey.

Who is it for?

This book is ideal for machine learning engineers, researchers, and data scientists looking to enhance their skills by incorporating causal inference into their work. It also suits individuals familiar with causal inference in other programming languages who wish to transition to Python, as well as beginners eager to dive into the world of causal AI and its applications.
