Errata for Natural Language Processing with PyTorch

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Each entry below lists the version, the location in the book, the description, the submitter, and the date submitted.
Version: -
Location: Example 5-14. Implementing the NewsClassifier

In Example 5-14, the __init__ method's docstring mentions "filter_width (int): width of the convolutional kernels", but filter_width is not a parameter of that method.

Anonymous  Feb 12, 2020 
Version: ?
Location: Example 1-2

I wondered why the figure in Example 1-2 doesn't show 0s in the "flies" column, since the text emphasizes that words occurring in all documents should have a TF-IDF of 0. According to the sklearn documentation, its TF-IDF calculator uses a nonstandard definition of IDF.
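
For reference, a minimal sketch of why a term appearing in every document still gets a nonzero weight, assuming scikit-learn's defaults (in particular smooth_idf=True):

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ['Time flies like an arrow.',
          'Fruit flies like a banana.']
tfidf_vectorizer = TfidfVectorizer()  # smooth_idf=True by default
tfidf = tfidf_vectorizer.fit_transform(corpus).toarray()
# With smooth_idf=True, idf(t) = ln((1 + n) / (1 + df(t))) + 1, so a term
# appearing in all n documents gets idf = 1 rather than 0, and its tf-idf
# entries stay positive after the per-row l2 normalization.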

Anonymous  May 05, 2020 
Version: ePub, Preface
Location: Using Code Examples section

In the Preface, the link to the supplemental material (code examples, exercises, etc.) is unavailable: nlproc.info/PyTorchNLPBook/repo/ no longer works. Is there a different URL, or could this one be updated?

Anonymous  May 08, 2023 
Version: 1
Location: Figure 1-4

In Chapter 1, Introduction, under the section "TF Representation",

Figure 1-4. The collapsed one-hot representation generated by Example 1-1

contains only 7 columns; the field 'a' seems to be missing. I believe there should be 8 columns.

Navin Kumar GopalaKrishnan  Feb 28, 2019 
Version: 5
Location: Example 4-8

in "Example 4-8. Hyperparameters and program options for the MLP-based Yelp review classifier" a comma is missing after "hidden_dim=300"

Anonymous  Feb 09, 2020 
Version: Printed, Page 7
Location: Last line of code on the page

- The output shows two y-axis labels, Sentence 1 and Sentence 2; however, the code only includes Sentence 2.

- Also, vocab is not defined in the code, which raises an error.

The code should be:

from sklearn.feature_extraction.text import CountVectorizer
import seaborn as sns

corpus = ['Time flies like an arrow.',
          'Fruit flies like a banana.']
one_hot_vectorizer = CountVectorizer(binary=True)
one_hot = one_hot_vectorizer.fit_transform(corpus).toarray()
vocab = one_hot_vectorizer.get_feature_names()
sns.heatmap(one_hot, annot=True,
            cbar=False, xticklabels=vocab,
            yticklabels=['Sentence 1', 'Sentence 2'])

Shareefa Alamer  Feb 08, 2021 
Version: Printed, Pages 7-9
Location: Code and output


Why does the output contain only 7 tokens, with the 'a' token missing?
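
A likely explanation, not confirmed by the author or editor: CountVectorizer's default token_pattern, r"(?u)\b\w\w+\b", keeps only tokens of two or more word characters, so single-letter words such as 'a' are dropped. A minimal sketch:

from sklearn.feature_extraction.text import CountVectorizer

corpus = ['Time flies like an arrow.',
          'Fruit flies like a banana.']

# The default token_pattern requires at least two word characters, so the
# single-character token 'a' never makes it into the vocabulary.
default_vectorizer = CountVectorizer(binary=True)
default_vectorizer.fit(corpus)
print(len(default_vectorizer.vocabulary_))   # 7 tokens; 'a' is missing

# Allowing single-character tokens brings the count to 8.
loose_vectorizer = CountVectorizer(binary=True, token_pattern=r"(?u)\b\w+\b")
loose_vectorizer.fit(corpus)
print(len(loose_vectorizer.vocabulary_))     # 8 tokens, including 'a'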

Shareefa Alamer  Feb 08, 2021 
Version: Printed, Page 111
Location: Figure 5-1

"By definition, the weight matrix of a Linear layer that accepts as input this one-­hot vector must have the same number of rows as the size of the one-­hot vector..."
one-­hot vector must have the same number of columns according to Fig.5.1
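
For reference, a small check of PyTorch's nn.Linear convention (the weight is stored with shape (out_features, in_features)); the sizes below are arbitrary:

import torch
import torch.nn as nn

vocab_size, embedding_dim = 10, 4
fc = nn.Linear(in_features=vocab_size, out_features=embedding_dim, bias=False)
print(fc.weight.shape)   # torch.Size([4, 10]): the columns match the one-hot size

one_hot = torch.zeros(1, vocab_size)
one_hot[0, 3] = 1.0
# Applying the layer to a one-hot vector selects column 3 of the stored weight.
print(torch.allclose(fc(one_hot), fc.weight[:, 3].unsqueeze(0)))   # True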

Alexander  Nov 12, 2019 
Version: Printed, Pages 169-170
Location: Full page and code snippet

Hi,

I have a question about the approach used in the surname generator based on the LSTM.

I was expecting to see a vectorizer producing a list of pairs (from_vector, to_vector) in which the real indices (not the begin/end ones) are progressively masked out. In other words, I was expecting the LSTM to be trained as a character-level generative model of surnames.

For example, with my last name, MASSOT, I was expecting the following first and last tuples, assuming vector_length is set to 8.

<begin> M <mask> <mask> <mask> <mask> <mask> <mask>
<begin> M A <mask> <mask> <mask> <mask> <mask>

<begin> M A S S O T <mask>
<begin> M A S S O T <end>

The proposed method produces the following tuple (if I understood correctly):

<begin> M A S S O T
M A S S O T <end>

The consequence, if I understand correctly, is that the LSTM is trained as a "blocky" surname generator, able to predict a sequence of characters following <begin> and terminated by <end> as a block, but not to generate names sequentially at the character level.
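
In code, the shifted construction I understand the chapter to be using looks roughly like this (the names are illustrative, not the book's actual code):

begin_idx, end_idx = 1, 2                  # indices of the <begin> and <end> tokens
char_indices = [5, 3, 9, 9, 7, 8]          # 'M A S S O T' as vocabulary indices

from_vector = [begin_idx] + char_indices   # <begin> M A S S O T
to_vector = char_indices + [end_idx]       # M A S S O T <end>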

Am I correct?

Thanks

Best regards

Jerome


Jerome MASSOT  Jul 18, 2020