Errata for Natural Language Processing with Python

The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Version	Location	Description	Submitted By	Date Submitted	Date Corrected
Printed	Page 1 .	The Natural Language Toolkit has been updated for Python 3.0, and the online version of the book has been updated. Please install NLTK 3 and consult the updated book (http://www.nltk.org/book) and user discussion forum (https://groups.google.com/group/nltk-users/) before reporting errata.	Anonymous	Oct 15, 2014
Printed	Page 9 16 lines down	lexical diversity() s/b lexical_diversity() -- with underscore instead of space	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 18 Axis label of plot	The source "fdist1.plot(50,cumulative=True)" gives a Y axis in counts and the label "Cumulative Counts", rather than a Y axis in percentage and the label "Cumulative Percentage". This is using Python 2.6.2 and the nltk and plotting packages downloaded on 9/13/09. Note from the Author or Editor: This has been addressed in second printing.	Bob Doherty	Sep 14, 2009	Jan 01, 2010
Printed	Page 18 Figure 1-4	Fig 1-4 or the code that creates it needs to be fixed (currently NLTK does counts, not percentages) Note from the Author or Editor: Already resolved on website and in second printing (December)	Anonymous	Dec 16, 2009	Jan 01, 2010
PDF	Page 29 Generating Language Output 2nd paragraph	"ils if the thieves are sold, and elle if the paintings are sold." This is not exactly wrong, still: "sold" should be replaced by "found" as "found" is used in the subsequent example and "selling" thieves is well...a little odd ;) Note from the Author or Editor: Agreed. In chapter 1, the sentence: if the thieves are sold, ... if the paintings are sold. Should be changed to: if the thieves are found, ... if the paintings are found.	Maximilian Scherr	Jan 15, 2011
	Page 46 first code block	Argument tuple to nltk.ConditionalFreqDist should be (target,fileid[:4]) not (target,file[:4]). Note from the Author or Editor: Please see: http://code.google.com/p/nltk/issues/detail?id=417 Already fixed in online version	Andrew C Young	Jul 08, 2009
Printed	Page 46 Figure 2.1	More contrast (supplied image was color)	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed, PDF, Other Digital Version	Page 83 n/a	In the section on processing rss feeds, a line reads: >>> nltk.word_tokenize(nltk.html_clean(content)) However, html_clean should be clean_html	Steven Bird	Nov 17, 2010
Printed	Page 88 "Your turn" code example, bottom of page	"for line in b: print b" should have been "for line in b: print line". Note from the Author or Editor: Please see: http://code.google.com/p/nltk/issues/detail?id=418 Fixed in online version. Will be fixed in next printing.	Robin Munn	Jul 08, 2009	Jan 01, 2010
PDF	Page 92 Table 3-2	The entry "s.titlecase() A titlecased version of the string s" should read "s.title() A titlecased version of the string s". Note from the Author or Editor: Please see: http://code.google.com/p/nltk/issues/detail?id=419 Fixed in online version. Will be corrected in next printing.	Anonymous	Jun 24, 2009	Jan 01, 2010
Printed	Page 113 .	the pronunciation of the Chinese character 国 ("country") is given as guo3; it should actually be guo2 Note from the Author or Editor: The proposed correction will be incorporated in the next issue of the book.	Anonymous	May 15, 2013
Printed	Page 115 Immediately after Example 3-3	"The final step is to search for the pattern of zeros and ones that maximizes this objective function, shown in Example 3.10. " Shouldn't it be "The final step is to search for the pattern of zeros and ones that MINIMIZES this objective function..."? At least that's what the anneal function does... Note from the Author or Editor: Addressed in second printing.	Uzi Halaby-Senerman	Sep 02, 2009	Jan 01, 2010
Printed	Page 132 9 lines up	"makes detection is easier" s/b "makes detection easier"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 144 16 lines up	"an empty dictionary" s/b "an empty list"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed, PDF	Page 152 line 17	[len(w) for w in nltk.corpus.brown.sents(categories='news'))] should be [len(w) for w in nltk.corpus.brown.sents(categories='news')] Note from the Author or Editor: Agreed.	Jun Utsumi	Apr 24, 2011
Printed	Page 153 3 lines down	Add quotes around "in-place dictionary" >> add following sentence: (Dictionaries will be presented in Section 5.3.)	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 153 & 154 Bottom and top of 154	Code block spanning page break: variable "trace" should be renamed to "verbose" x4	Anonymous	Dec 16, 2009	Jan 01, 2010
PDF	Page 163 Example 4-6	There is no need to use nltk.defaultdict in this example. The following code works fine. #trie = nltk.defaultdict(dict) trie = {} insert(trie, 'chat', 'cat') insert(trie, 'chien', 'dog') insert(trie, 'chair', 'flesh') insert(trie, 'chic', 'stylish') #trie = dict(trie) # for nicer printing trie['c']['h']['a']['t']['value'] pprint.pprint(trie) Note from the Author or Editor: I agree. The line: trie = nltk.defaultdict(dict) should be changed to: trie = {}	mg6t	Sep 01, 2011
PDF	Page 165 5th line	>>> statement = "random.randint(0, %d) in vocab" % vocab_size * 2 This line should be corrected as follows: >>> statement = "random.randint(0, %d) in vocab" % (vocab_size * 2) Note from the Author or Editor: Confirmed.	mg6t	Aug 28, 2011
Printed	Page 172 3 lines down	"dendogram" s/b "dendrogram"	Anonymous	Dec 16, 2009	Jan 01, 2010
PDF	Page 175 Exercise No. 19	The nltk.corpus.wordnet object does not have path_distance(). It must be path_similarity(). Note from the Author or Editor: The exercise should be changed to specify shortest_path_distance() instead of path_distance().	mg6t	Sep 01, 2011
Printed	Page 177 Example 33	Move to chapter 5 (new exercise 43). Change reference "described in chapter 5" to "described in this chapter"	Anonymous	Dec 16, 2009	Jan 01, 2010
PDF	Page 207 center of the page	The line >>> print nltk.ConfusionMatrix(gold, test) causes an error. It should be >>> print nltk.ConfusionMatrix(gold_tags, test_tags) Note from the Author or Editor: Agreed. The line: print nltk.ConfusionMatrix(gold, test) should be: print nltk.ConfusionMatrix(gold_tags, test_tags)	mg6t	Aug 29, 2011
PDF	Page 273 Example 2-6	In example 7.4, the last statement in the parse method calls the method nltk.chunk.conlltags2tree. In NLTK 2.0.4 nltk.chunk has no such method. The call should be to nltk.chunk.util.conlltags2tree. Note from the Author or Editor: This error has been fixed by changing the imports at the level of the nltk.chunk package.	Peter Haglich	Jan 09, 2013
PDF	Page 284 Code under 7.6	There is a method call to nltk.sem.show_raw_rtuple, which doesn't exist in NLTK 2.0.4. The call should be nltk.sem.relextract.show_raw_rtuple. Note from the Author or Editor: The interface to this module has been updated (see https://raw.github.com/nltk/nltk/master/ChangeLog), and the source files of the book have been revised accordingly.	Peter Haglich	Jan 09, 2013
Printed	Page 306 17 lines up	"The advantages of shift-reduce" s/b "The advantage of shift-reduce"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 309 9 lines up	"through entire list" s/b "through the entire list"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 309 13-14 lines up	"Det at wfst[0][1] and N at wfst[1][2], we can add NP to wfst[0][2]" s/b "Det at wfst[2][3] and N at wfst[3][4], we can add NP to wfst[2][4]"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 334 10 lines down	Delete this whole line, viz "NP[NUM=?n] -> N[NUM=?n]", and close up space.	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 336 Figure 9.1	Larger scale (closer in size to example (18) same page), fix broken vbars (reported as too big last time, but now it is too small.)	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 336 Figure 9-1	Fig 9-1 is too big in the latest pdf. Also, the feature labels shouldn't be bold.	Anonymous	Dec 16, 2009	Feb 01, 2010
Printed	Page 339 Diagram (23)	Incorrect diagram; it should be the one found here: http://nltk.googlecode.com/svn/trunk/doc/book/ch09.html#ex-dag04 Note from the Author or Editor: Already resolved on website and in second printing (December)	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 340 Example 24	s/b smaller for consistency with the other DAGs (cf p339)	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 342 DAG (27a)	DAG (27a) is incorrect. It should look just like (27c) but without the middle arc labeled 'CITY'. (The online version of this chapter is correct, and uses dag04-1.png for this subfigure.)	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 355 Code block	Remove box from code block	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 363 1st paragraph	Within the code following the sentence "This allows us to parse a query into SQL:", there is a line that assigns a value to the 'answer' variable: answer = trees[0].node['sem'] When I try to enter this line, I get a stack trace followed by this error message: KeyError: 'sem' When I enter answer = trees[0].node['SEM'] I get the prompt back, without any stack trace or error message. Note from the Author or Editor: It should say: answer = trees[0].node['SEM'] Addressed in 2nd printing.	Vance Arocho	Dec 01, 2009	Jan 01, 2010
Printed	Page 363 21 lines down	node['sem'] s/b node['SEM'] NB This is http://www.oreillynet.com/cs/nl/edit/errata/40392	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 373 Approx halfway down the page	"... in this context it could be of some other type, such as <e, e> or <e, <e, t>.": Should be "... in this context it could be of some other type, such as <e, e> or <e, <e, t>>." (unbalanced angle brackets.) Note from the Author or Editor: Addressed in second printing.	Bruce C. Baker	Sep 19, 2009	Jan 01, 2010
Printed	Page 373 19 lines down	"such as <e, e> or <e, <e, t>." s/b "such as <e, e> or <e, <e, t>>."	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 382 Figure (28)	Smaller scale	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 385 Bottom of page	In the following two lines, the string "?subj" should be replaced with "?np" (two substitutions): (30) S[SEM=<?vp(?np)>] -> NP[SEM=?subj] VP[SEM=?vp] (30) tells us that given some sem value ?subj for the subject NP and some sem value ?vp for the VP	Steven Bird	Oct 06, 2010	Nov 01, 2010
Printed	Page 389 17 lines up	"nltk.Variable('z')" s/b "nltk.sem.Variable('z')"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 391 6 lines down	Insert space before "yields"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 392 6 lines down	"nltk.ApplicationExpression(tvp, np)" s/b "nltk.sem.ApplicationExpression(tvp, np)"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 393 8 lines up	semrel s/b semrep	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 393 5 lines up	exists z3.(ankle(z3) & bite(cyril,z3)) s/b all z4.(boy(z4) -> see(cyril,z4))	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 395 8 lines up from bottom	"core", "store" s/b uc in the SEM value of VP	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 395 5-9 lines up	The string "?subj" should be replaced with "?np" (three substitutions): S[SEM=[core=<?vp(?subj)>, store=(?b1+?b2)]] -> NP[SEM=[core=?subj, store=?b1]] VP[SEM=[core=?vp, store=?b2]] The core value at the S node is the result of applying the VP's core value, namely \x.smile(x), to the subject NP's value. The latter will not be @x, but rather an instantiation of @x, say z3. After β-reduction, <?vp(?subj)> will be unified with <smile(z3)>.	Steven Bird	Oct 06, 2010	Nov 01, 2010
Printed	Page 396 20 lines up	"trees[0].node['sem']" s/b "trees[0].node['SEM']"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 399 4 lines down	Det[NUM=sg,SEM=<\P Q.([x],[]) + P(x) + Q(x)>] -> 'a' s/b Det[NUM=sg,SEM=<\P Q.(([x],[]) + P(x) + Q(x))>] -> 'a'	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 400 20 lines down	"trees[0].node['sem'].simplify()" s/b "trees[0].node['SEM'].simplify()"	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 405-406 Examples 5-7	Please replace all seven occurrences of "nltk.ApplicationExpression" with "nltk.sem.ApplicationExpression".	Anonymous	Dec 16, 2009	Dec 01, 2009
Printed	Page 426 top	Currently reads: </sense> <gloss> ... </gloss> <synset> ... </synset> </sense> ... Should be: </sense> <sense> <gloss> ... </gloss> <synset> ... </synset> </sense> ...	Bruce C. Baker	Sep 27, 2009	Jan 01, 2010
Printed	Page 429 11-12 lines down	Sentence beginning with "Ignoring...", please replace with the following (and set "OTH" in cw): Ignoring the entries for exchanges between people other than the top 5 (labeled OTH), the largest value suggests that Portia and Bassanio have the most significant interactions.	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 444 7 lines down	can never been known s/b can never be known	Anonymous	Dec 16, 2009	Jan 01, 2010
Printed	Page 467 Toward bottom of left hand column	"deve-test" -> "dev-test"	Anonymous	Dec 16, 2009	Jan 01, 2010
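
Three of the entries above (Page 92, Page 163 and Page 165) concern plain Python behaviour and can be checked without installing NLTK. The following is a minimal sketch in Python 3, not the book's own code. The value of vocab_size is a hypothetical stand-in, and the insert() function is a small illustration assumed to behave like the one in Example 4-6, written here only to show that an ordinary dict is enough.

```python
# Page 165: "%" and "*" share the same precedence and group left to right,
# so without parentheses the string is formatted first and then repeated.
vocab_size = 100  # hypothetical value, chosen only for illustration
template = "random.randint(0, %d) in vocab"
wrong = template % vocab_size * 2      # formatted with 100, then duplicated
right = template % (vocab_size * 2)    # formatted once, with 200

# Page 92: Python strings provide title(); there is no titlecase() method.
has_titlecase = hasattr("", "titlecase")
titled = "natural language".title()


# Page 163: a trie built from a plain dict, with no defaultdict.
# This insert() is an assumed, illustrative version of the book's helper.
def insert(trie, key, value):
    if key:
        first, rest = key[0], key[1:]
        if first not in trie:
            trie[first] = {}          # create the child node on demand
        insert(trie[first], rest, value)
    else:
        trie['value'] = value         # end of key: store the value here


trie = {}
insert(trie, 'chat', 'cat')
insert(trie, 'chien', 'dog')
found = trie['c']['h']['a']['t']['value']

print(wrong)
print(right)
print(titled, has_titlecase, found)
```

The parenthesised form is the one the Page 165 entry confirms. The unparenthesised form raises no error, which may be why the slip went unnoticed.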