Errata

Errata for Think Stats

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version	Location	Description	Submitted by	Date submitted
Printed	Page 4 Fifth paragraph	The example in the previous paragraph lists the variable 'pregordr', but the following paragraph misspells it: "pregorder is a one-byte integer..." Note the extra "e".	Scott Breitbach	Sep 13, 2020
PDF	Page 40 5th paragraph, in code sample	In PDF version 1.6, page 40, the following line does not work: myplot.Cdf(cdf, complement=True, xscale='linear', yscale='log'). In the existing myplot.py (as of 3/19/2015), xscale and yscale are never passed on to pyplot. An alternative is to call myplot.Config(xscale='log', yscale='log').	Bo Dong	Mar 20, 2015
PDF	Page 41 last sentence of the 1st paragraph	The text says: "By contrast, the exponential distribution with median 1 has 95th percentile of only 1.5." I think the percentile should be around 4.3. Please confirm, thanks. For an exponential distribution, median = ln(2)/lam, so when median = 1, lam is ln(2). The output of the following code snippet is: expo mean is 1.000050, percentile 95% is 4.361694 def CDFExpo(lam, point_n=1000, name='CDF Expo'): xs= sorted([random.expovariate(lam) for i in xrange(point_n)]) ys= [1- pow(math.e, -lam * x) for x in xs] return Cdf.Cdf(xs,ys,name) lam = math.log(2) mean, percentile = 0.5, 0.95 cdf = CDFExpo(lam) print 'expo mean is %f, percentile 95%% is %f' % (cdf.Value(mean), cdf.Value(percentile))	Bo Dong	Mar 19, 2015
Printed	Page 99 Third formula (for \rho)	The denominator in the formula for \rho as written is identically equal to 0. This is because the deviations of x and the deviations of y each sum to zero, i.e. 0*0 = 0. To remedy, replace each sum of deviations in the current denominator with the square root of the corresponding sum of squared deviations: \rho = \frac{\sum dx_i dy_i}{ \sqrt{\sum dx_i^2} \sqrt{\sum dy_i^2}}. Cheers, Stephen Stohs PS Thanks for creating the great, thin intro stats text!	Stephen Stohs	Apr 06, 2019
Printed	Page 104 Residual formula	While not mathematically incorrect, especially in the case of squared residuals, the sign on your residual formula is opposite of the way it is normally defined, including in your definition of deviations on p. 98. (Note that a regression residual is the deviation of a data point from the regression line, which is the conditional mean of y given a value of x.) The universal convention is to define deviations from the mean as positive for data values above the mean and negative otherwise. To remedy this, revise to: \varepsilon_i = y_i - (\alpha + \beta x_i).	Stephen Stohs	Apr 06, 2019
Printed	Page 110 Method ChiSquared in Class PregLengthTest	Wonderful book! I would just like an explanation of the following code: def ChiSquared … expected = self.expected_probs * len(lengths) Why are the expected_probs multiplied by the number of all entries in the corresponding interval? To get a probability I would do this: pmf(35) * number_entries_in(35) + ... pmf(44) * number_entries_in(44) Thank you for your help! Fabio	Fabio	Mar 06, 2017
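
The arithmetic behind the Page 41, Page 99, and Page 104 submissions above can be checked with a short Python 3 sketch. It is not taken from the book or from the submitters' code: the function names are hypothetical, it uses only the standard library, and it computes the exponential percentile analytically rather than by sampling, so the result differs slightly from the sampled value reported in the Page 41 entry.

```python
import math


def expo_percentile(median, p):
    """Quantile p of an exponential distribution with the given median.

    Uses median = ln(2) / lam and inverts CDF(x) = 1 - exp(-lam * x).
    """
    lam = math.log(2) / median
    return -math.log(1 - p) / lam


def deviations(values):
    """Deviations from the mean; by construction these sum to (about) zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pearson_rho(xs, ys):
    """Correlation with the denominator proposed in the Page 99 entry:
    sqrt(sum dx_i^2) * sqrt(sum dy_i^2).
    """
    dxs, dys = deviations(xs), deviations(ys)
    numerator = sum(dx * dy for dx, dy in zip(dxs, dys))
    denominator = math.sqrt(sum(dx ** 2 for dx in dxs)) * math.sqrt(
        sum(dy ** 2 for dy in dys)
    )
    return numerator / denominator


def residuals(xs, ys, alpha, beta):
    """Residuals with the sign convention proposed in the Page 104 entry:
    eps_i = y_i - (alpha + beta * x_i).
    """
    return [y - (alpha + beta * x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    # 95th percentile for median 1: ln(20) / ln(2), about 4.32, not 1.5.
    print(expo_percentile(1.0, 0.95))

    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.1, 3.9, 6.2, 7.8]
    # The raw deviations sum to roughly zero, which is why a denominator
    # built from plain sums of deviations cannot work.
    print(sum(deviations(xs)), sum(deviations(ys)))
    print(pearson_rho(xs, ys))
    print(residuals(xs, ys, alpha=0.0, beta=2.0))
```

With this sign convention, a data point above the fitted line gets a positive residual, which matches the definition of deviations the Page 104 entry refers to.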