Errata for Natural Language Processing with Transformers

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version Location Description Submitted by Date Submitted
Printed Page 68
Second paragraph (beginning with "Here we've initialized...")

"in practice it is chosen to be a multiple of embed_dim" should be changed to "in practice it is chosen to be a factor of embed_dim" - or say that head_dim is chosen so that embed_dim is a multiple of head_dim, or...

Anonymous  Jan 25, 2024 
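
The relationship described in the page 68 report above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the head dimension is derived as `embed_dim // num_heads`; the values 768 and 12 are the BERT-base hidden size and head count, used here only for illustration.

```python
# Sketch: head_dim is derived from embed_dim, so embed_dim is a multiple
# of head_dim (equivalently, head_dim is a factor of embed_dim).
embed_dim = 768   # illustrative: BERT-base hidden size
num_heads = 12    # illustrative: BERT-base attention heads

# The split only works if embed_dim divides evenly across the heads.
assert embed_dim % num_heads == 0, "embed_dim must be divisible by num_heads"
head_dim = embed_dim // num_heads

# Concatenating the per-head outputs restores the full embedding width.
concat_dim = num_heads * head_dim
print(head_dim, concat_dim)  # 64 768
```

With these numbers, `head_dim` (64) is smaller than `embed_dim` (768), which is why "factor of" is the accurate wording.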
ePub Page 54
A First Look at Hugging Face Datasets

The "emotion" dataset does not exist.

Rod Monk  Nov 16, 2023 
Chapter 4, Error Analysis
looking at the online version, so I'm not sure (confusion matrix)

To create the confusion matrix, you're using:

plot_confusion_matrix(df_tokens["labels"], df_tokens["predicted_label"],
                      tags.names)

But this gives a plot that's not correct. It should be (as per your function definition):
plot_confusion_matrix(df_tokens["predicted_label"], df_tokens["labels"],
                      tags.names)

I realized this by computing the confusion matrix "by hand" and getting a different result:
conf_matrix = df_tokens.loss.groupby([df_tokens.labels, df_tokens.predicted_label]).count().unstack()
conf_matrix = conf_matrix.div(conf_matrix.sum(axis=1), axis=0)

Daniel Vaughan  Sep 20, 2023 
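
Why the argument order matters in the Chapter 4 report above can be seen on a toy example. This is a minimal sketch in pure Python: `row_normalized_confusion` is a hypothetical helper (not the book's `plot_confusion_matrix`), and the labels and predictions are made up. Rows are normalized over the first argument, so swapping true and predicted labels changes the result rather than merely transposing it.

```python
from collections import Counter

def row_normalized_confusion(y_true, y_pred, labels):
    """Return a matrix with one row per true label, each row summing to 1."""
    counts = Counter(zip(y_true, y_pred))
    matrix = []
    for t in labels:
        row = [counts[(t, p)] for p in labels]
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Made-up toy data: three "O" tokens, one of them mispredicted as "B-PER".
labels = ["O", "B-PER"]
y_true = ["O", "O", "O", "B-PER"]
y_pred = ["O", "O", "B-PER", "B-PER"]

correct = row_normalized_confusion(y_true, y_pred, labels)
swapped = row_normalized_confusion(y_pred, y_true, labels)

print(correct)  # rows normalized over true labels
print(swapped)  # rows normalized over predicted labels
```

In the first matrix the "O" row shows that one third of true "O" tokens were mislabeled; in the swapped matrix that error shows up in the "B-PER" row instead, which is the kind of discrepancy the report describes.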
Printed Page xvii
Chapter 8 item

In the penultimate line of this item, it says "[...] explore techniques such a knowledge distillation [...]". It should be "such as".

Sofia Oliveira  Oct 17, 2022 
Printed Page xvi
First paragraph of "Who is this for"

On the fourth line of the paragraph, in the sentence, "[...] we assume you are comfortable programming in Python and has a basic understanding of Deep Learning frameworks [...]", I think it should be "have" instead of "has"

Sofia Oliveira  Oct 17, 2022 
Printed Page 69, 71
second code block

attn_output is used on page 69 but attn_outputs is used on page 71; likely you meant the former to be plural, too.

Robert Rodger  Jul 13, 2022 
Printed Page 103
2nd paragraph

The XLM-R tokenizer returns both the input IDs and the attention mask.
So, in the 2nd and 3rd lines of the 2nd paragraph,

"XLM-R tokenizer returns the input IDs for the model’s inputs, we just need to augment this information with the attention mask and the label IDs.."

should be

"XLM-R tokenizer returns the input IDs and attention mask for the model’s inputs, we just need to augment this information with the label IDs..".

Thanks.

Haesun Park  Jul 04, 2022 
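
The point of the page 103 report above is that only the label IDs need to be added to what the tokenizer already returns. This is a minimal sketch under stated assumptions: the `encoding` dict is a hand-written stand-in for tokenizer output with placeholder IDs (not real XLM-R vocabulary IDs), `add_label_ids` is a hypothetical helper, and -100 is used as the ignore value because it is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
def add_label_ids(encoding, word_ids, word_labels, ignore_index=-100):
    """Attach token-level label IDs to a tokenizer-style encoding."""
    labels = []
    previous_word = None
    for word_idx in word_ids:
        if word_idx is None or word_idx == previous_word:
            # Special tokens and non-initial subwords are masked out.
            labels.append(ignore_index)
        else:
            # The first subword of each word carries the word's label.
            labels.append(word_labels[word_idx])
        previous_word = word_idx
    return {**encoding, "labels": labels}

# Stand-in for tokenizer output: input IDs and attention mask already exist.
encoding = {
    "input_ids": [0, 11, 12, 13, 2],   # placeholder values
    "attention_mask": [1, 1, 1, 1, 1],
}
word_ids = [None, 0, 0, 1, None]  # token -> source word; None = special token
word_labels = [1, 0]              # one made-up label ID per word

aligned = add_label_ids(encoding, word_ids, word_labels)
print(aligned["labels"])  # [-100, 1, -100, 0, -100]
```

The input IDs and attention mask pass through unchanged; the only new key is `labels`.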
Printed Page 103
3rd paragraph

The documentation link should probably be "https://bit.ly/3y6z3kc", not "https://oreil.ly/lGPgh".
Thanks.

Haesun Park  Jul 04, 2022 
Chapter 7
Middle section

The text reads:
"In our case, we need a node to evaluate the retriever, so we’ll use the
EvalRetriever class whose run() method keeps track "

The class is called "EvalDocuments", not "EvalRetriever", so it should read:

"In our case, we need a node to evaluate the retriever, so we’ll use the
EvalDocuments class whose run() method keeps track "

Carlos Aguayo  Feb 18, 2022 
Chapter 7
First "Note"

The text reads:
"community QA involves gathering question-answer pairs that are generated by users on forums like Stack Overflow, and then using semantic similarity search to find the closest matching answer to a new question".

I think it was meant to say "find the closest matching question to a new question".

Carlos Aguayo  Feb 15, 2022
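
The distinction raised in the report above (match the new question against stored questions, then return the paired answer) can be sketched as follows. This is only an illustration: the question-answer pairs are made up, and bag-of-words cosine similarity stands in for the semantic similarity search that a real community QA system would run over learned embeddings.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector as a token -> count mapping."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Made-up forum data: stored question -> accepted answer.
qa_pairs = {
    "how do i reverse a list in python": "Use list.reverse() or slicing with [::-1].",
    "how do i read a file in python": "Use open() inside a with block.",
}

def answer(new_question):
    """Find the closest stored question, then return it with its answer."""
    query = bow(new_question)
    best_question = max(qa_pairs, key=lambda q: cosine(query, bow(q)))
    return best_question, qa_pairs[best_question]

matched_question, matched_answer = answer("reverse a python list")
print(matched_question)
print(matched_answer)
```

The similarity is computed question-to-question; the answer is looked up only after the closest stored question has been found.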