Version 
Location 
Description 
Submitted By 
Date submitted 
Date corrected 

Page Confidence intervals
A couple of pages in 
On Kindle app, error in formula
Mistake: the factor...to get a 1-alpha confidence interval is given by abs(ppf((1-alpha)/2))
Correction: the factor...to get a 1-alpha confidence interval is given by abs(ppf(alpha/2))
Note from the Author or Editor: Right after “confidence interval is given by”, inside the formula, it should be ppf(alpha) instead of ppf(1-alpha). The following code is correct and needs no modification.

Francis Doornaert 
Aug 17, 2023 
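
As a quick check of the confidence-interval correction above, the sketch below compares the corrected factor with the misprinted one. It is an illustration only: it assumes a standard normal distribution and uses the standard library's `NormalDist().inv_cdf` as a stand-in for the `ppf` call mentioned in the erratum; the helper name `ci_factor` is hypothetical.

```python
from statistics import NormalDist

def ci_factor(alpha: float) -> float:
    """Standard-error multiplier for a two-sided (1 - alpha) interval.

    inv_cdf is the standard library's inverse CDF (the ppf).
    """
    return abs(NormalDist().inv_cdf(alpha / 2))

alpha = 0.05
corrected = ci_factor(alpha)  # roughly 1.96 for a 95% interval

# The misprinted expression evaluates the ppf near the centre of the
# distribution, so it returns a factor close to zero instead.
misprinted = abs(NormalDist().inv_cdf((1 - alpha) / 2))
```

With alpha = 0.05 the corrected expression gives the familiar multiplier of about 1.96, while the misprinted one gives a value below 0.1.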

Printed 
Page Page 22, Section "A visual guide to Bias"
In the image right after "The reason for this is bias, which is depicted in the right plot:” 
On the image, on the leftmost curly braces, the equation should be
E[Y|T=1] - E[Y|T=0], not E[Y|T=1] = E[Y|T=0].

Matthew Facure 
Sep 01, 2023 


Page Chapter 3, Section "Conditioning on a Collider"
Local 3042 of Kindle version, first formula 
The formula states:
E[Y|T=1,R=1] - E[Y|T=1,R=1] = E[Y_1 - Y_0|R=1] + E[Y_0|T=0,R=1] - E[Y_0|T=1,R=1]
I think on the left side the second term should be E[Y|T=0,R=1], since it is the average difference between treated and untreated units who responded to the survey.
Note from the Author or Editor: The formula should be
E[Y|T=1,R=1] - E[Y|T=0,R=1] = E[Y_1 - Y_0|R=1] + E[Y_0|T=1,R=1] - E[Y_0|T=0,R=1]

Felipe Frigeri 
Sep 07, 2023 
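
For readers who want to see where the signs in the collider formula come from, here is one way to derive the decomposition. It is a sketch, not the book's own derivation: it only uses the fact that the observed outcome Y equals Y_1 for treated units and Y_0 for untreated units, then adds and subtracts E[Y_0|T=1,R=1].

```latex
\begin{align*}
E[Y|T=1,R=1] - E[Y|T=0,R=1]
  &= E[Y_1|T=1,R=1] - E[Y_0|T=0,R=1] \\
  &= E[Y_1 - Y_0|T=1,R=1]
     + \underbrace{E[Y_0|T=1,R=1] - E[Y_0|T=0,R=1]}_{\text{SelectionBias}}
\end{align*}
```

The bias term carries T=1 first and T=0 second, matching the corrected formula. Writing the first term as E[Y_1 - Y_0|R=1], as the corrected formula does, additionally assumes that the effect among treated respondents equals the effect among all respondents; that reading is an assumption on my part.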


Page Chapter 11 RDD “The IV Estimate”
The code blocks 
Firstly, it appears that the cutoff value used in the code is 10k, while it should actually be 5k. This has a downstream effect on the regression models and, consequently, the calculated LATE.
Secondly, the code implies that the ITTE can be directly derived from the conditional coefficient for the intercept in the linear regression. This would be a valid approach if the cutoff were at 0, but it's actually at 5k. This simplification seems to contradict the locality assumption of RDD, which states that the estimator is valid only near the threshold `R=c`.
Note from the Author or Editor: In the section Intention to Treat Effect (pg 356 of the printed book), the paragraph right after the table should be updated to:
"Then, let's center the running variable, balance, to shift the threshold to zero. In this case, since the discontinuity is at 5000, you can do this by subtracting 5000 from the balance variable. (This is just a trick to make interpreting the regression parameters easier). Next, you need to regress the outcome variable on the centered running variable R interacted with a dummy for being above the threshold (R > 0):
y_i = \beta_0 + \beta_1 r_i + \beta_2 \mathbb{1}\{r_i>0\} + \beta_3 \mathbb{1}\{r_i>0\} r_i
The parameter estimate associated with crossing the threshold…"
Also, code block 20 should be:
m = smf.ols(f"pv~balance*I(balance>0)",
            df_dd.assign(balance=lambda d: d["balance"] - 5000)).fit()
m.summary().tables[1]
And table resulting from this code should be as in the updated code, cell 25:
https://github.com/matheusfacure/causalinferenceinpythoncode/blob/main/causalinferenceinpython/11NonComplianceandInstruments.ipynb
In the section The IV Estimate, code block 21 should be updated to
def rdd_iv(data, y, t, r, cutoff):
    centered_df = data.assign(**{r: data[r] - cutoff})
    compliance = smf.ols(f"{t}~{r}*I({r}>0)", centered_df).fit()
    itte = smf.ols(f"{y}~{r}*I({r}>0)", centered_df).fit()
    param = f"I({r} > 0)[T.True]"
    return itte.params[param]/compliance.params[param]
rdd_iv(df_dd, y="pv", t="prime_card", r="balance", cutoff=5000)
The result from this code block should also be updated to 732.8534752298891. See code block 27 in the GitHub link above.
Finally, the array just before the Bunching section should be updated to array([655.08214249, 807.83207567]). See code block 30 in the GitHub link above.

Alex Roy 
Oct 30, 2023 


Page 36
2nd Paragraph 
The woman and man values should be switched to make sense with the rest of the paragraph.
"When you look at age, treatment groups seem very much alike, but there seems to be a difference in gender (woman = 1, man = 0)."
Note from the Author or Editor: It should be "(woman = 0, man = 1)".

Clayton Schoeny 
Jul 24, 2023 


Page 42
1st Equation 
In the equation for the estimate of the standard deviation, the summation should start at i=1, not i=0.
Note from the Author or Editor: In the equation, it should be i=1, not i=0.

Clayton Schoeny 
Jul 24, 2023 


Page 48
Practical Example 
The equation following "They report the efficacy of the vaccine," is not correct. It's printed as E[Y|T = 0] / E[Y|T = 1], but this would give us a value of 56.5/3.3 = 17.12.
Rather, one way to correctly write the equation is 1 - (E[Y|T = 1] / E[Y|T = 0]).
Note from the Author or Editor: The equation after "They report the efficacy of the vaccine" should be 1 - (E[Y|T = 1] / E[Y|T = 0]).

Clayton Schoeny 
Jul 31, 2023 
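
The arithmetic behind the vaccine-efficacy correction above can be checked directly. The sketch below uses the two numbers quoted in the erratum and assumes, from the order of the quoted ratio, that 56.5 is E[Y|T = 0] and 3.3 is E[Y|T = 1]; the function name `vaccine_efficacy` is hypothetical.

```python
def vaccine_efficacy(outcome_treated: float, outcome_control: float) -> float:
    """Efficacy as 1 - E[Y|T=1] / E[Y|T=0]."""
    return 1 - outcome_treated / outcome_control

outcome_control = 56.5  # E[Y|T=0], as quoted in the erratum
outcome_treated = 3.3   # E[Y|T=1], as quoted in the erratum

efficacy = vaccine_efficacy(outcome_treated, outcome_control)

# The misprinted ratio, which is not a proportion at all.
misprinted = outcome_control / outcome_treated
```

The corrected expression gives an efficacy of roughly 0.94, whereas the misprinted ratio gives about 17.12, the value the submitter points out.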

Printed 
Page 58
Code cell 19 
Missing a **2 in the code “np.ceil(16 * no_email.std()**2/0.01)”. It should be
“np.ceil(16 * no_email.std()**2/0.01**2)”; however, this gives a number that is too large to fit with the text around this code. A better solution is to change the detectable difference from 1% to 8%.
“So, if you want to craft a cross-sell email experiment where you want to detect an 8% difference, like the one you saw in this conversion email example, you must have a sample size that gives you at least 8% = 2.8SE.
[...]
In [19]: np.ceil(16 * (no_email.std()/0.08)**2)
Out[19]: 103.0
"

Matthew Facure 
Sep 01, 2023 
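
To show how much the missing square matters in the sample-size code above, the sketch below evaluates both versions of the formula. The values of `sigma` and `delta` are hypothetical placeholders, not the book's data, and the function names are made up for illustration.

```python
import math

def sample_size(sigma: float, delta: float) -> int:
    """Corrected rule of thumb: N = 16 * sigma^2 / delta^2."""
    return math.ceil(16 * sigma**2 / delta**2)

def sample_size_misprinted(sigma: float, delta: float) -> int:
    """Misprinted version, with delta not squared."""
    return math.ceil(16 * sigma**2 / delta)

sigma, delta = 0.2, 0.03  # hypothetical standard deviation and detectable difference

n_correct = sample_size(sigma, delta)
n_misprinted = sample_size_misprinted(sigma, delta)

# Same quantity written the way the corrected code cell writes it.
n_cell_style = math.ceil(16 * (sigma / delta) ** 2)
```

Because delta is smaller than 1, leaving out the square understates the required sample by a factor of 1/delta, here 712 versus 22.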

Printed 
Page 60
Last equation in the chapter. 
In the equation right after “you could simplify the sample size formula to:”, there is a ^2 missing. It is
N = 16 * σ^2/δ
but it should be
N = 16 * σ^2/δ^2.
The correct equation can be found on page 58.

Matthew Facure 
Sep 01, 2023 
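
The squared δ in the corrected formula follows from the detectability condition δ = 2.8SE quoted in the page 58 erratum. The derivation below is a sketch that assumes two groups of equal size N with a common standard deviation σ, so that the standard error of the difference in means is σ√(2/N); treating N as a per-group size is my assumption.

```latex
\begin{align*}
\delta &= 2.8\,SE_{\Delta}, \qquad SE_{\Delta} = \sigma\sqrt{2/N} \\
\delta^2 &= 2.8^2 \cdot \frac{2\sigma^2}{N} \\
N &= \frac{2 \cdot 2.8^2\,\sigma^2}{\delta^2}
   = \frac{15.68\,\sigma^2}{\delta^2}
   \approx \frac{16\,\sigma^2}{\delta^2}
\end{align*}
```

Since δ enters through the standard error, which scales with the square root of N, solving for N necessarily leaves δ squared in the denominator.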

Printed 
Page 97
“It projects all the X variables into the outcome dimension and makes the comparison between treatment and control on that projection.” 
It should be “It projects the outcome variable into the X variables and makes the comparison between treatment and control on that projection.”

Matthew Facure 
Sep 01, 2023 


Page 151 (Conditioning on a collider)
After first paragraph 
The left-hand side of the formula contains an error that has already been submitted as an erratum by another reader (Felipe Frigeri).
But there is also an error on the right-hand side, in the SelectionBias collection of terms:
E[Y_0|T=0, R=1] - E[Y_0|T=1, R=1]
should be corrected to
E[Y_0|T=1, R=1] - E[Y_0|T=0, R=1]
Note from the Author or Editor: The rightmost term, above SelectionBias, should be E[Y_0|T=1, R=1] - E[Y_0|T=0, R=1]

Francis Doornaert 
Sep 14, 2023 


Page 433
Multiple Cohorts charts or code block 
The example description and code snippet say the data is subset to the West region, but the example charts are labeled "Multiple Cohorts - North Region".
Note from the Author or Editor: The 1st Plot in the Staggered Adoption section should read West instead of North. This was already fixed in the book's code, cell 42.
https://github.com/matheusfacure/causalinferenceinpythoncode/blob/main/causalinferenceinpython/08DifferenceinDifferences.ipynb

Kara Downey 
Sep 14, 2023 
