Errata

Errata for Applied Text Analysis with Python

The errata list records errors, and their corrections, that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version	Location	Description	Submitted by	Date submitted
	Chapter 4, first code listing after heading "Pipeline Basics"	Running the code in Python raises this error: TypeError: Last step of Pipeline should implement fit. 'MultinomialNB()' (type <class 'str'>) doesn't. Fix: remove the two quote marks surrounding the call to MultinomialNB.	Anonymous	Mar 16, 2018
PDF	Page 13 4th paragraph	"Co-occurrences show which words are likely to proceed and succeed each other..." 'proceed' should be 'precede': "Co-occurrences show which words are likely to precede and succeed each other..."	Amar	Dec 24, 2018
Printed, PDF	Page 31 chapter 2	The errata pages here refer to completely different chapter names, and to a different book as far as I can tell. For example, in the real book chapter 2 is called "Building a Custom Corpus", not "Text Acquisition and Ingestion". I came here trying to understand why the extension in the DOC_PATTERN regex is json and not html in the "Reading an HTML Corpus" section. Anyway, the sample doesn't work.	Anonymous	Nov 06, 2018
Printed	Page 106, Model evaluation (Chinese version)	In "for X_train, X_test, y_train, y_test in loader: model.fit(X_train, y_train)", what types of parameters are accepted here? The code fails when run as given at this URL: https://github.com/foxbook/atap/blob/master/snippets/ch05/ner.py	zeng sir	Jan 21, 2021
Printed	Page 143 4th paragraph	Calls to trigram_counts.ngrams[3] and trigram_counts.ngrams[3].conditions() seem to be made on an instance of FreqDist when they should be made on the instance of ConditionalFreqDist, which is trigram_counts.allgrams. According to the nltk documentation, FreqDist doesn't have a conditions() method, which makes me think the book is incorrect. Also, the text on the page says it is retrieving conditional frequency information. The code on page 142 seems to record that in trigram_counts.allgrams, not trigram_counts.ngrams, but the calls on page 143 retrieve it from trigram_counts.ngrams.	Anonymous	Jun 23, 2019
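
The Page 143 report turns on the difference between two nltk classes: FreqDist counts whole items, while ConditionalFreqDist counts samples per condition and is the one that provides conditions(). The following is a minimal sketch of that difference, not the book's code. The toy trigrams and the names ngram_counts and conditional_counts are invented for illustration.

```python
from nltk.probability import ConditionalFreqDist, FreqDist

# Toy trigrams, invented for illustration only (not from the book).
trigrams = [
    ("the", "quick", "fox"),
    ("the", "quick", "dog"),
    ("the", "lazy", "dog"),
]

# A FreqDist counts whole items; here each item is a complete trigram.
ngram_counts = FreqDist(trigrams)

# A ConditionalFreqDist counts samples per condition: the first two
# words form the condition and the third word is the sample.
conditional_counts = ConditionalFreqDist(
    ((w1, w2), w3) for w1, w2, w3 in trigrams
)

# FreqDist exposes no conditions() method; ConditionalFreqDist does.
has_conditions_on_freqdist = hasattr(ngram_counts, "conditions")
conditions = sorted(conditional_counts.conditions())

# Words observed after the condition ("the", "quick"), with their counts.
next_words = dict(conditional_counts[("the", "quick")])

print(has_conditions_on_freqdist)
print(conditions)
print(next_words)
```

If the book's trigram_counts.ngrams[3] is indeed a FreqDist, as the report suggests, calling conditions() on it would raise an AttributeError, which is consistent with the submitter's reading.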