Errata for Natural Language Processing with PyTorch

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Each entry below lists the version, the location in the book, the description, the submitter, and the date submitted.
Version: -
Location: Example 5-14. Implementing the NewsClassifier

In Example 5-14, the __init__ method's docstring mentions "filter_width (int): width of the convolutional kernels", but filter_width is not a parameter of that method.

Anonymous  Feb 12, 2020 
Version: ?
Location: Example 1-2

I wondered why the figure in Example 1-2 doesn't show 0s in the "flies" column, since the text emphasizes that words occurring in all documents should have a TF-IDF of 0. According to the sklearn documentation, its TF-IDF calculator uses a nonstandard definition of IDF.
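
For reference, a minimal sketch of why a term appearing in every document still gets a nonzero weight, assuming scikit-learn's defaults (in particular smooth_idf=True):

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ['Time flies like an arrow.',
          'Fruit flies like a banana.']
tfidf_vectorizer = TfidfVectorizer()  # smooth_idf=True by default
tfidf = tfidf_vectorizer.fit_transform(corpus).toarray()
# With smooth_idf=True, idf(t) = ln((1 + n) / (1 + df(t))) + 1, so a term
# appearing in all n documents gets idf = 1 rather than 0, and its tf-idf
# entries stay positive after the per-row l2 normalization.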

Anonymous  May 05, 2020 
Version: ePub, Preface
Location: Using Code Examples section

In the Preface, the link to the supplemental material (code examples, exercises, etc.) is unavailable: nlproc.info/PyTorchNLPBook/repo/ no longer works. Is there a different URL, or could this one be updated?

Anonymous  May 08, 2023 
Version: 1
Location: Figure 1-4

In Chapter 1, Introduction, under the section "TF Representation",

Figure 1-4. The collapsed one-hot representation generated by Example 1-1

contains only 7 columns; the field 'a' seems to be missing. I believe there should be 8 columns.

Navin Kumar GopalaKrishnan  Feb 28, 2019 
Version: 5
Location: Example 4-8

in "Example 4-8. Hyperparameters and program options for the MLP-based Yelp review classifier" a comma is missing after "hidden_dim=300"

Anonymous  Feb 09, 2020 
Version: Printed, Page 7
Location: Last line of code on the page

- The output shows two y-axis labels, Sentence 1 and Sentence 2; however, the code only includes Sentence 2.

- Also, vocab is not defined in the code, which raises an error.

The code should be:

from sklearn.feature_extraction.text import CountVectorizer
import seaborn as sns

corpus = ['Time flies like an arrow.',
          'Fruit flies like a banana.']
one_hot_vectorizer = CountVectorizer(binary=True)
one_hot = one_hot_vectorizer.fit_transform(corpus).toarray()
vocab = one_hot_vectorizer.get_feature_names()
sns.heatmap(one_hot, annot=True,
            cbar=False, xticklabels=vocab,
            yticklabels=['Sentence 1', 'Sentence 2'])

Shareefa Alamer  Feb 08, 2021 
Version: Printed, Pages 7-9
Location: Code and output


Why does the output contain only 7 tokens, with the 'a' token missing?
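
A likely explanation, not confirmed by the author or editor: CountVectorizer's default token_pattern, r"(?u)\b\w\w+\b", keeps only tokens of two or more word characters, so single-letter words such as 'a' are dropped. A minimal sketch:

from sklearn.feature_extraction.text import CountVectorizer

corpus = ['Time flies like an arrow.',
          'Fruit flies like a banana.']

# The default token_pattern requires at least two word characters, so the
# single-character token 'a' never makes it into the vocabulary.
default_vectorizer = CountVectorizer(binary=True)
default_vectorizer.fit(corpus)
print(len(default_vectorizer.vocabulary_))   # 7 tokens; 'a' is missing

# Allowing single-character tokens brings the count to 8.
loose_vectorizer = CountVectorizer(binary=True, token_pattern=r"(?u)\b\w+\b")
loose_vectorizer.fit(corpus)
print(len(loose_vectorizer.vocabulary_))     # 8 tokens, including 'a'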

Shareefa Alamer  Feb 08, 2021 
Version: Printed, Page 111
Location: Figure 5-1

"By definition, the weight matrix of a Linear layer that accepts as input this one-­hot vector must have the same number of rows as the size of the one-­hot vector..."
one-­hot vector must have the same number of columns according to Fig.5.1
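
For reference, a small check of PyTorch's nn.Linear convention (the weight is stored with shape (out_features, in_features)); the sizes below are arbitrary:

import torch
import torch.nn as nn

vocab_size, embedding_dim = 10, 4
fc = nn.Linear(in_features=vocab_size, out_features=embedding_dim, bias=False)
print(fc.weight.shape)   # torch.Size([4, 10]): the columns match the one-hot size

one_hot = torch.zeros(1, vocab_size)
one_hot[0, 3] = 1.0
# Applying the layer to a one-hot vector selects column 3 of the stored weight.
print(torch.allclose(fc(one_hot), fc.weight[:, 3].unsqueeze(0)))   # True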

Alexander  Nov 12, 2019 
Version: Printed, Pages 169-170
Location: Full page and code snippet

Hi,

I have a question about the approach used in the surname generator based on the LSTM.

I was expecting to see a vectorizer producing a list of pairs (from_vector, to_vector) in which the real indices (not the begin/end ones) are progressively masked out. In other words, I was expecting the LSTM to be trained as a character-level generative model of surnames.

For example, with my last name, MASSOT, I was expecting the following first and last tuples, assuming vector_length is set to 8.

<begin> M <mask> <mask> <mask> <mask> <mask> <mask>
<begin> M A <mask> <mask> <mask> <mask> <mask>

<begin> M A S S O T <mask>
<begin> M A S S O T <end>

The proposed method produces the following tuple (if I understood correctly):

<begin> M A S S O T
M A S S O T <end>

The consequence, if I understand correctly, is that the LSTM is trained as a "blocky" surname generator, able to predict a sequence of characters following <begin> and terminated by <end> as a block, but not to generate names sequentially at the character level.
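
In code, the shifted construction I understand the chapter to be using looks roughly like this (the names are illustrative, not the book's actual code):

begin_idx, end_idx = 1, 2                  # indices of the <begin> and <end> tokens
char_indices = [5, 3, 9, 9, 7, 8]          # 'M A S S O T' as vocabulary indices

from_vector = [begin_idx] + char_indices   # <begin> M A S S O T
to_vector = char_indices + [end_idx]       # M A S S O T <end>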

Am I correct?

Thanks

Best regards

Jerome


Jerome MASSOT  Jul 18, 2020