Causal Inference and Discovery in Python

Book description

Demystify causal inference and causal discovery by uncovering causal principles and merging them with powerful machine learning algorithms for observational and experimental data. Purchase of the print or Kindle book includes a free PDF eBook.

Key Features

  • Examine Pearlian causal concepts such as structural causal models, interventions, counterfactuals, and more
  • Discover modern causal inference techniques for average and heterogeneous treatment effect estimation
  • Explore and leverage traditional and modern causal discovery methods

Book Description

Causal methods present unique challenges compared to traditional machine learning and statistics. Learning causality can be challenging, but it offers distinct advantages that elude a purely statistical mindset. Causal Inference and Discovery in Python helps you unlock the potential of causality.

You’ll start with the basic motivations behind causal thinking and a comprehensive introduction to Pearlian causal concepts, such as structural causal models, interventions, counterfactuals, and more. Each concept is accompanied by a theoretical explanation and a set of practical exercises with Python code. Next, you’ll dive into the world of causal effect estimation, steadily progressing towards modern machine learning methods. Step by step, you’ll discover the Python causal ecosystem and harness the power of cutting-edge algorithms. You’ll further explore the mechanics of how “causes leave traces” and compare the main families of causal discovery algorithms. The final chapter gives you a broad outlook on the future of causal AI, examining challenges and opportunities and providing a comprehensive list of resources to learn more.

By the end of this book, you will be able to build your own models for causal inference and discovery using statistical and machine learning techniques as well as perform basic project assessment.

What you will learn

  • Master the fundamental concepts of causal inference
  • Decipher the mysteries of structural causal models
  • Unleash the power of the 4-step causal inference process in Python
  • Explore advanced uplift modeling techniques
  • Unlock the secrets of modern causal discovery using Python
  • Use causal inference for social impact and community benefit

Who this book is for

This book is for machine learning engineers, researchers, and data scientists looking to extend their toolkit and explore causal machine learning. It will also help people who’ve worked with causality using other programming languages and now want to switch to Python, those who have worked with traditional causal inference and want to learn about causal machine learning, and tech-savvy entrepreneurs who want to go beyond the limitations of traditional ML. You are expected to have basic knowledge of Python and Python scientific libraries, along with knowledge of basic probability and statistics.

Table of contents

  1. Causal Inference and Discovery in Python – Machine Learning and Pearlian Perspective
  2. Foreword
  3. Contributors
  4. About the author
  5. About the reviewers
  6. Acknowledgments
  7. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Conventions used
    6. Get in touch
    7. Share Your Thoughts
    8. Join our book's Discord space
    9. Download a free PDF copy of this book
  8. Part 1: Causality – an Introduction
  9. Chapter 1: Causality – Hey, We Have Machine Learning, So Why Even Bother?
    1. A brief history of causality
    2. Why causality? Ask babies!
      1. Interacting with the world
      2. Confounding – relationships that are not real
    3. How not to lose money… and human lives
      1. A marketer’s dilemma
      2. Let’s play doctor!
      3. Associations in the wild
    4. Wrapping it up
    5. References
    6. Join our book's Discord space
  10. Chapter 2: Judea Pearl and the Ladder of Causation
    1. From associations to logic and imagination – the Ladder of Causation
    2. Associations
      1. Let’s practice!
    3. What are interventions?
      1. Changing the world
      2. Correlation and causation
    4. What are counterfactuals?
      1. Let’s get weird (but formal)!
      2. The fundamental problem of causal inference
      3. Computing counterfactuals
      4. Time to code!
    5. Extra – is all machine learning causally the same?
      1. Causality and reinforcement learning
      2. Causality and semi-supervised and unsupervised learning
    6. Wrapping it up
    7. References
  11. Chapter 3: Regression, Observations, and Interventions
    1. Starting simple – observational data and linear regression
      1. Linear regression
      2. p-values and statistical significance
      3. Geometric interpretation of linear regression
      4. Reversing the order
    2. Should we always control for all available covariates?
      1. Navigating the maze
      2. If you don’t know where you’re going, you might end up somewhere else
      3. Get involved!
      4. To control or not to control?
    3. Regression and structural models
      1. SCMs
      2. Linear regression versus SCMs
      3. Finding the link
      4. Regression and causal effects
    4. Wrapping it up
    5. References
  12. Chapter 4: Graphical Models
    1. Graphs, graphs, graphs
      1. Types of graphs
      2. Graph representations
      3. Graphs in Python
    2. What is a graphical model?
    3. DAG your pardon? Directed acyclic graphs in the causal wonderland
      1. Definitions of causality
      2. DAGs and causality
      3. Let’s get formal!
      4. Limitations of DAGs
    4. Sources of causal graphs in the real world
      1. Causal discovery
      2. Expert knowledge
      3. Combining causal discovery and expert knowledge
    5. Extra – is there causality beyond DAGs?
      1. Dynamical systems
      2. Cyclic SCMs
    6. Wrapping it up
    7. References
  13. Chapter 5: Forks, Chains, and Immoralities
    1. Graphs and distributions and how to map between them
      1. How to talk about independence
      2. Choosing the right direction
      3. Conditions and assumptions
    2. Chains, forks, and colliders or…immoralities
      1. A chain of events
      2. Chains
      3. Forks
      4. Colliders, immoralities, or v-structures
      5. Ambiguous cases
    3. Forks, chains, colliders, and regression
      1. Generating the chain dataset
      2. Generating the fork dataset
      3. Generating the collider dataset
      4. Fitting the regression models
    4. Wrapping it up
    5. References
    6. Join our book's Discord space
  14. Part 2: Causal Inference
  15. Chapter 6: Nodes, Edges, and Statistical (In)dependence
    1. You’re gonna keep ‘em d-separated
      1. Practice makes perfect – d-separation
    2. Estimand first!
      1. We live in a world of estimators
      2. So, what is an estimand?
    3. The back-door criterion
      1. What is the back-door criterion?
      2. Back-door and equivalent estimands
    4. The front-door criterion
      1. Can GPS lead us astray?
      2. London cabbies and the magic pebble
      3. Opening the front door
      4. Three simple steps toward the front door
      5. Front-door in practice
    5. Are there other criteria out there? Let’s do-calculus!
      1. The three rules of do-calculus
      2. Instrumental variables
    6. Wrapping it up
    7. Answer
    8. References
  16. Chapter 7: The Four-Step Process of Causal Inference
    1. Introduction to DoWhy and EconML
      1. Python causal ecosystem
      2. Why DoWhy?
      3. Oui, mon ami, but what is DoWhy?
      4. How about EconML?
    2. Step 1 – modeling the problem
      1. Creating the graph
      2. Building a CausalModel object
    3. Step 2 – identifying the estimand(s)
    4. Step 3 – obtaining estimates
    5. Step 4 – where’s my validation set? Refutation tests
      1. How to validate causal models
      2. Introduction to refutation tests
    6. Full example
      1. Step 1 – encode the assumptions
      2. Step 2 – getting the estimand
      3. Step 3 – estimate!
      4. Step 4 – refute them!
    7. Wrapping it up
    8. References
    9. Join our book's Discord space
  17. Chapter 8: Causal Models – Assumptions and Challenges
    1. I am the king of the world! But am I?
      1. In between
      2. Identifiability
      3. Lack of causal graphs
      4. Not enough data
      5. Unverifiable assumptions
      6. An elephant in the room – hopeful or hopeless?
      7. Let’s eat the elephant
    2. Positivity
    3. Exchangeability
      1. Exchangeable subjects
      2. Exchangeability versus confounding
    4. …and more
      1. Modularity
      2. SUTVA
      3. Consistency
    5. Call me names – spurious relationships in the wild
      1. Names, names, names
      2. Should I ask you or someone who’s not here?
      3. DAG them!
      4. More selection bias
    6. Wrapping it up
    7. References
  18. Chapter 9: Causal Inference and Machine Learning – from Matching to Meta-Learners
    1. The basics I – matching
      1. Types of matching
      2. Treatment effects – ATE versus ATT/ATC
      3. Matching estimators
      4. Implementing matching
    2. The basics II – propensity scores
      1. Matching in the wild
      2. Reducing the dimensionality with propensity scores
      3. Propensity score matching (PSM)
    3. Inverse probability weighting (IPW)
      1. Many faces of propensity scores
      2. Formalizing IPW
      3. Implementing IPW
      4. IPW – practical considerations
    4. S-Learner – the Lone Ranger
      1. The devil’s in the detail
      2. Mom, Dad, meet CATE
      3. Jokes aside, say hi to the heterogeneous crowd
      4. Waving the assumptions flag
      5. You’re the only one – modeling with S-Learner
      6. Small data
      7. S-Learner’s vulnerabilities
    5. T-Learner – together we can do more
      1. Forcing the split on treatment
      2. T-Learner in four steps and a formula
      3. Implementing T-Learner
    6. X-Learner – a step further
      1. Squeezing the lemon
      2. Reconstructing the X-Learner
      3. X-Learner – an alternative formulation
      4. Implementing X-Learner
    7. Wrapping it up
    8. References
  19. Chapter 10: Causal Inference and Machine Learning – Advanced Estimators, Experiments, Evaluations, and More
    1. Doubly robust methods – let’s get more!
      1. Do we need another thing?
      2. Doubly robust is not equal to bulletproof…
      3. …but it can bring a lot of value
      4. The secret doubly robust sauce
      5. Doubly robust estimator versus assumptions
      6. DR-Learner – crossing the chasm
      7. DR-Learners – more options
      8. Targeted maximum likelihood estimator
    2. If machine learning is cool, how about double machine learning?
      1. Why DML and what’s so double about it?
      2. DML with DoWhy and EconML
      3. Hyperparameter tuning with DoWhy and EconML
      4. Is DML a golden bullet?
      5. Doubly robust versus DML
      6. What’s in it for me?
    3. Causal Forests and more
      1. Causal trees
      2. Forests overflow
      3. Advantages of Causal Forests
      4. Causal Forest with DoWhy and EconML
    4. Heterogeneous treatment effects with experimental data – the uplift odyssey
      1. The data
      2. Choosing the framework
      3. We don’t know half of the story
      4. Kevin’s challenge
      5. Opening the toolbox
      6. Uplift models and performance
      7. Other metrics for continuous outcomes with multiple treatments
      8. Confidence intervals
      9. Kevin’s challenge’s winning submission
      10. When should we use CATE estimators for experimental data?
      11. Model selection – a simplified guide
    5. Extra – counterfactual explanations
      1. Bad faith or tech that does not know?
    6. Wrapping it up
    7. References
  20. Chapter 11: Causal Inference and Machine Learning – Deep Learning, NLP, and Beyond
    1. Going deeper – deep learning for heterogeneous treatment effects
      1. CATE goes deeper
      2. SNet
    2. Transformers and causal inference
      1. The theory of meaning in five paragraphs
      2. Making computers understand language
      3. From philosophy to Python code
      4. LLMs and causality
      5. The three scenarios
      6. CausalBert
    3. Causality and time series – when an econometrician goes Bayesian
      1. Quasi-experiments
      2. Twitter acquisition and our googling patterns
      3. The logic of synthetic controls
      4. A visual introduction to the logic of synthetic controls
      5. Starting with the data
      6. Synthetic controls in code
      7. Challenges
    4. Wrapping it up
    5. References
  21. Part 3: Causal Discovery
  22. Chapter 12: Can I Have a Causal Graph, Please?
    1. Sources of causal knowledge
      1. You and I, oversaturated
      2. The power of a surprise
    2. Scientific insights
      1. The logic of science
      2. Hypotheses are a species
      3. One logic, many ways
      4. Controlled experiments
      5. Randomized controlled trials (RCTs)
      6. From experiments to graphs
      7. Simulations
    3. Personal experience and domain knowledge
      1. Personal experiences
      2. Domain knowledge
    4. Causal structure learning
    5. Wrapping it up
    6. References
    7. Join our book's Discord space
  23. Chapter 13: Causal Discovery and Machine Learning – from Assumptions to Applications
    1. Causal discovery – assumptions refresher
      1. Gearing up
      2. Always trying to be faithful…
      3. …but it’s difficult sometimes
      4. Minimalism is a virtue
    2. The four (and a half) families
      1. The four streams
    3. Introduction to gCastle
      1. Hello, gCastle!
      2. Synthetic data in gCastle
      3. Fitting your first causal discovery model
      4. Visualizing the model
      5. Model evaluation metrics
    4. Constraint-based causal discovery
      1. Constraints and independence
      2. Leveraging the independence structure to recover the graph
      3. PC algorithm – hidden challenges
      4. PC algorithm for categorical data
    5. Score-based causal discovery
      1. Tabula rasa – starting fresh
      2. GES – scoring
      3. GES in gCastle
    6. Functional causal discovery
      1. The blessings of asymmetry
      2. ANM model
      3. Assessing independence
      4. LiNGAM time
    7. Gradient-based causal discovery
      1. What exactly is so gradient about you?
      2. Shed no tears
      3. GOLEMs don’t cry
      4. The comparison
    8. Encoding expert knowledge
      1. What is expert knowledge?
      2. Expert knowledge in gCastle
    9. Wrapping it up
    10. References
  24. Chapter 14: Causal Discovery and Machine Learning – Advanced Deep Learning and Beyond
    1. Advanced causal discovery with deep learning
      1. From generative models to causality
      2. Looking back to learn who you are
      3. DECI’s internal building blocks
      4. DECI in code
      5. DECI is end-to-end
    2. Causal discovery under hidden confounding
      1. The FCI algorithm
      2. Other approaches to confounded data
    3. Extra – going beyond observations
      1. ENCO
      2. ABCI
    4. Causal discovery – real-world applications, challenges, and open problems
    5. Wrapping it up!
    6. References
  25. Chapter 15: Epilogue
    1. What we’ve learned in this book
    2. Five steps to get the best out of your causal project
      1. Starting with a question
      2. Obtaining expert knowledge
      3. Generating hypothetical graph(s)
      4. Check identifiability
      5. Falsifying hypotheses
    3. Causality and business
      1. How causal doers go from vision to implementation
    4. Toward the future of causal ML
      1. Where are we now and where are we heading?
      2. Causal benchmarks
      3. Causal data fusion
      4. Intervening agents
      5. Causal structure learning
      6. Imitation learning
    5. Learning causality
    6. Let’s stay in touch
    7. Wrapping it up
    8. References
    9. Join our book's Discord space
  26. Index
    1. Why subscribe?
    2. Join our book's Discord space
  27. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts
    3. Download a free PDF copy of this book

Product information

  • Title: Causal Inference and Discovery in Python
  • Author(s): Aleksander Molak
  • Release date: May 2023
  • Publisher(s): Packt Publishing
  • ISBN: 9781804612989