Errata

Errata for Natural Language Processing with Transformers, Revised Edition


The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.


Version	Location	Description	Submitted by	Date submitted
ePub	Page Notebook 02_classification.ipynb Notebook 02_classification.ipynb	Notebook 02_classification.ipynb causes the following problem: emotions_encoded.set_format("torch", columns=["input_ids", "attention_mask", "label"]) The exception report below says that PyTorch is not installed, which is not correct: (ValueError: PyTorch needs to be installed to be able to return PyTorch tensors). Unexpected exception formatting exception. Falling back to standard exception Traceback (most recent call last): File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "/tmp/ipykernel_26043/2662095365.py", line 1, in <module> emotions_encoded.set_format("torch", File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/dataset_dict.py", line 583, in set_format writer_batch_size: Optional[int] = 1000, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/fingerprint.py", line 511, in wrapper File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 2515, in set_format keep_in_memory=keep_in_memory, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/formatting/__init__.py", line 128, in get_formatter ) File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "/tmp/ipykernel_26043/304637329.py", line 1, in <module> emotions_encoded.set_format("torch", File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/dataset_dict.py", line 583, in set_format writer_batch_size: Optional[int] = 1000, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/fingerprint.py", line 511, in wrapper File 
"/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 2515, in set_format keep_in_memory=keep_in_memory, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/formatting/__init__.py", line 128, in get_formatter ) File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "/tmp/ipykernel_26043/304637329.py", line 1, in <module> emotions_encoded.set_format("torch", File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/dataset_dict.py", line 583, in set_format writer_batch_size: Optional[int] = 1000, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/fingerprint.py", line 511, in wrapper File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 2515, in set_format keep_in_memory=keep_in_memory, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/formatting/__init__.py", line 128, in get_formatter ) File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "/tmp/ipykernel_26043/304637329.py", line 1, in <module> emotions_encoded.set_format("torch", File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/dataset_dict.py", line 583, in set_format writer_batch_size: Optional[int] = 1000, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/fingerprint.py", line 511, in wrapper File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 2515, in set_format keep_in_memory=keep_in_memory, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/datasets/formatting/__init__.py", line 128, in 
get_formatter ) ValueError: PyTorch needs to be installed to be able to return PyTorch tensors. During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2105, in showtraceback stb = self.InteractiveTB.structured_traceback( File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1396, in structured_traceback return FormattedTB.structured_traceback( File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1287, in structured_traceback return VerboseTB.structured_traceback( File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1140, in structured_traceback formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context, File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1055, in format_exception_as_a_whole frames.append(self.format_record(record)) File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/ultratb.py", line 955, in format_record frame_info.lines, Colors, self.has_colors, lvals File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/IPython/core/ultratb.py", line 778, in lines return self._sd.lines File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper value = obj.__dict__[self.func.__name__] = self.func(obj) File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/stack_data/core.py", line 734, in lines pieces = self.included_pieces File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper value = obj.__dict__[self.func.__name__] = self.func(obj) File 
"/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/stack_data/core.py", line 681, in included_pieces pos = scope_pieces.index(self.executing_piece) File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper value = obj.__dict__[self.func.__name__] = self.func(obj) File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/stack_data/core.py", line 660, in executing_piece return only( File "/home/silvija/nlp_notebooks/virtualENV/lib/python3.8/site-packages/executing/executing.py", line 190, in only raise NotOneValueFound('Expected one value, found 0') executing.executing.NotOneValueFound: Expected one value, found 0 I found a workaround by running emotions_encoded.set_format("torch") first and then emotions_encoded.set_format("torch", columns=["input_ids", "attention_mask", "label"]). However, I have a different problem that I don’t know how to solve: After logging into Hugging Face Hub, as instructed in the same notebook (and the textbook), and creating a write access token, I executed the cell: from transformers import Trainer, TrainingArguments batch_size = 64 logging_steps = len(emotions_encoded["train"]) // batch_size model_name = f"{model_ckpt}-finetuned-emotion" training_args = TrainingArguments(output_dir=model_name, num_train_epochs=2, learning_rate=2e-5, per_device_train_batch_size=batch_size, per_device_eval_batch_size=batch_size, weight_decay=0.01, evaluation_strategy="epoch", disable_tqdm=False, logging_steps=logging_steps, push_to_hub=True, log_level="error") When I run the next cell from transformers import Trainer trainer = Trainer(model=model, args=training_args, compute_metrics=compute_metrics, train_dataset=emotions_encoded["train"], eval_dataset=emotions_encoded["validation"], tokenizer=tokenizer) trainer.train(); I get the ‘Repository not found’ error: `git clone` has been updated in upstream Git to have comparable speeds to `git lfs clone`. 
Cloning into '.'... remote: Repository not found fatal: repository 'huggingface/kokaljfilipovic/distilbert-base-uncased-finetuned-emotion/' not found Error(s) during clone: `git clone` failed: exit status 128	Silvija Kokalj-Filipovic	Jul 24, 2023
Printed	Page Various, see the detail section Various, see the detail section	p. 252: github-issues-transfomers.jsonl; p. 271: “This is example is about {}”; p. 259: […] there are sophisticated methods than can leverage […]; p. 300: Unlike the code in the others in this book…; p. 302: writing tool or for a building a game.; p. 342: in Chapter 5 --> Chapter 6	Anonymous	Sep 17, 2023
Printed	Page Chap. 10, 323-336 all	It appears that the GPT-2 architecture is used. What remains a bit unclear is the input and output of the model being trained from scratch. My understanding is that GPT-2 is trained to predict (just) the next token. Here it seems to be different, as suggested by Figure 10-2. What exactly is the input-output behavior of the network during training? For each training example, does the model mask the last few tokens of the input and set the target to be identical to the input? If so, where is this masking defined? Or is the scheme different? Please clarify.	Anonymous	Oct 19, 2023
Printed	Page Pg 61 After 3rd paragraph	I'm not sure whether this is an erratum, but I have checked several sources and I believe the "self-attention" formula is slightly different: W_ji should be W_ij, I think.	Cayetano Romero	Jan 12, 2024
Printed	Page Chapter 6 part of the code of this chapter	# hide from transformers import pipeline, set_seed It generates the warnings and error message below: WARNING:tensorflow:From C:\Users\ziad.elmously.MLCORP\Anaconda3\envs\LargeLanguageModels\lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead. --------------------------------------------------------------------------- ModuleNotFoundError Traceback (most recent call last) File ~\Anaconda3\envs\LargeLanguageModels\lib\site-packages\transformers\file_utils.py:2704, in _LazyModule._get_module(self, module_name) 2703 try: -> 2704 return importlib.import_module("." + module_name, self.__name__) 2705 except Exception as e: File ~\Anaconda3\envs\LargeLanguageModels\lib\importlib\__init__.py:127, in import_module(name, package) 126 level += 1 --> 127 return _bootstrap._gcd_import(name[level:], package, level) File <frozen importlib._bootstrap>:1030, in _gcd_import(name, package, level) File <frozen importlib._bootstrap>:1007, in _find_and_load(name, import_) File <frozen importlib._bootstrap>:986, in _find_and_load_unlocked(name, import_) File <frozen importlib._bootstrap>:680, in _load_unlocked(spec) File <frozen importlib._bootstrap_external>:850, in exec_module(self, module) File <frozen importlib._bootstrap>:228, in _call_with_frames_removed(f, args, *kwds) File ~\Anaconda3\envs\LargeLanguageModels\lib\site-packages\transformers\pipelines\__init__.py:54 53 from .question_answering import QuestionAnsweringArgumentHandler, QuestionAnsweringPipeline ---> 54 from .table_question_answering import TableQuestionAnsweringArgumentHandler, TableQuestionAnsweringPipeline 55 from .text2text_generation import SummarizationPipeline, Text2TextGenerationPipeline, TranslationPipeline File ~\Anaconda3\envs\LargeLanguageModels\lib\site-packages\transformers\pipelines\table_question_answering.py:24 22 import 
tensorflow as tf ---> 24 import tensorflow_probability as tfp 26 from ..models.auto.modeling_tf_auto import TF_MODEL_FOR_TABLE_QUESTION_ANSWERING_MAPPING File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\__init__.py:20 17 # Contributors to the `python/` dir should not alter this file; instead update 18 # `python/__init__.py` as necessary. ---> 20 from tensorflow_probability import substrates 21 # from tensorflow_probability.google import staging # DisableOnExport 22 # from tensorflow_probability.google import tfp_google # DisableOnExport File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\substrates\__init__.py:17 15 """TensorFlow Probability alternative substrates.""" ---> 17 from tensorflow_probability.python.internal import all_util 18 from tensorflow_probability.python.internal import lazy_loader # pylint: disable=g-direct-tensorflow-import File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\__init__.py:138 137 for pkg_name in _maybe_nonlazy_load: --> 138 dir(globals()[pkg_name]) # Forces loading the package from its lazy loader. 
141 all_util.remove_undocumented(__name__, _lazy_load + _maybe_nonlazy_load) File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\lazy_loader.py:57, in LazyLoader.__dir__(self) 56 def __dir__(self): ---> 57 module = self._load() 58 return dir(module) File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\lazy_loader.py:40, in LazyLoader._load(self) 39 # Import the target module and insert it into the parent's namespace ---> 40 module = importlib.import_module(self.__name__) 41 if self._parent_module_globals is not None: File ~\Anaconda3\envs\LargeLanguageModels\lib\importlib\__init__.py:127, in import_module(name, package) 126 level += 1 --> 127 return _bootstrap._gcd_import(name[level:], package, level) File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\experimental\__init__.py:31 30 from tensorflow_probability.python.experimental import auto_batching ---> 31 from tensorflow_probability.python.experimental import bijectors 32 from tensorflow_probability.python.experimental import distribute File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\experimental\bijectors\__init__.py:17 15 """TensorFlow Probability experimental bijectors package.""" ---> 17 from tensorflow_probability.python.bijectors.ldj_ratio import forward_log_det_jacobian_ratio 18 from tensorflow_probability.python.bijectors.ldj_ratio import inverse_log_det_jacobian_ratio File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\bijectors\__init__.py:19 17 # pylint: disable=unused-import,wildcard-import,line-too-long,g-importing-member ---> 19 from tensorflow_probability.python.bijectors.absolute_value import AbsoluteValue 20 from tensorflow_probability.python.bijectors.ascending import Ascending File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\bijectors\absolute_value.py:19 17 import tensorflow.compat.v2 
as tf ---> 19 from tensorflow_probability.python.bijectors import bijector 20 from tensorflow_probability.python.internal import assert_util File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\bijectors\bijector.py:26 25 from tensorflow_probability.python.internal import auto_composite_tensor ---> 26 from tensorflow_probability.python.internal import batch_shape_lib 27 from tensorflow_probability.python.internal import cache_util File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\batch_shape_lib.py:23 21 import tensorflow.compat.v2 as tf ---> 23 from tensorflow_probability.python.internal import prefer_static as ps 24 from tensorflow_probability.python.internal import tensor_util File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\prefer_static.py:26 25 from tensorflow_probability.python.internal import tensorshape_util ---> 26 from tensorflow_probability.python.internal.backend import numpy as nptf 28 from tensorflow.python.framework import ops # pylint: disable=g-direct-tensorflow-import File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\backend\numpy\__init__.py:18 17 from tensorflow_probability.python.internal.backend.numpy import __internal__ ---> 18 from tensorflow_probability.python.internal.backend.numpy import bitwise 19 from tensorflow_probability.python.internal.backend.numpy import compat File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\backend\numpy\bitwise.py:19 17 import numpy as np ---> 19 from tensorflow_probability.python.internal.backend.numpy import _utils as utils 21 __all__ = [ 22 'bitwise_xor', 23 'left_shift', 24 ] File ~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\backend\numpy\_utils.py:22 21 import numpy as np ---> 22 from tensorflow_probability.python.internal.backend.numpy import nest 24 try: File 
~\AppData\Roaming\Python\Python39\site-packages\tensorflow_probability\python\internal\backend\numpy\nest.py:30 29 import types ---> 30 import tree as dm_tree 32 # pylint: disable=g-import-not-at-top ModuleNotFoundError: No module named 'tree' The above exception was the direct cause of the following exception: RuntimeError Traceback (most recent call last) Cell In[1], line 2 1 # hide ----> 2 from transformers import pipeline, set_seed File <frozen importlib._bootstrap>:1055, in _handle_fromlist(module, fromlist, import_, recursive) File ~\Anaconda3\envs\LargeLanguageModels\lib\site-packages\transformers\file_utils.py:2694, in _LazyModule.__getattr__(self, name) 2692 value = self._get_module(name) 2693 elif name in self._class_to_module.keys(): -> 2694 module = self._get_module(self._class_to_module[name]) 2695 value = getattr(module, name) 2696 else: File ~\Anaconda3\envs\LargeLanguageModels\lib\site-packages\transformers\file_utils.py:2706, in _LazyModule._get_module(self, module_name) 2704 return importlib.import_module("." + module_name, self.__name__) 2705 except Exception as e: -> 2706 raise RuntimeError( 2707 f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its traceback):\n{e}" 2708 ) from e RuntimeError: Failed to import transformers.pipelines because of the following error (look up to see its traceback): No module named 'tree' Please note that I have already installed the required packages using the command: !pip install -r requirements.txt I am running the script in Jupyter Notebook.	Ziad Elmously	Jan 25, 2024
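Editorial note on the entry above: the final `ModuleNotFoundError` is for a top-level module named `tree`, which `tensorflow_probability` imports as `dm_tree`. To the best of our knowledge that module is provided by the `dm-tree` distribution on PyPI rather than by anything listed under the name `tree`. The following is a minimal sketch for checking which import names are missing before importing `transformers`; the mapping and the helper name `missing_modules` are assumptions made for illustration and are not part of the book's code.

```python
import importlib.util

# Hypothetical mapping from import name to the pip distribution that provides it.
# "tree" -> "dm-tree" is the case from the traceback above; "json" is a standard
# library module, included only so the example has one entry that always resolves.
IMPORT_TO_PIP = {
    "tree": "dm-tree",
    "json": None,  # standard library, nothing to install
}


def missing_modules(mapping):
    """Return {import_name: pip_name} for every module that cannot be found."""
    missing = {}
    for import_name, pip_name in mapping.items():
        if importlib.util.find_spec(import_name) is None:
            missing[import_name] = pip_name
    return missing


for import_name, pip_name in missing_modules(IMPORT_TO_PIP).items():
    print(f"'{import_name}' is not importable; try: pip install {pip_name}")
```

The traceback also shows `tensorflow_probability` being loaded from the user site-packages directory rather than from the Anaconda environment, so it may be worth running the check from inside the same notebook kernel.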
Printed	Page 74 last chapter	Book: Natural Language Processing mit Transformatoren (German edition). training_args = TrainingArguments(output_dir=model_name, num_train_epochs=2, learning_rate=2e-5, per_device_train_batch_size=batch_size, per_device_eval_batch_size=batch_size, weight_decay=0.01, evaluation_strategy="epoch", disable_tqdm=False, logging_steps=logging_steps, push_to_hub=True, log_level="error") I have run `pip install transformers[torch]` and `pip install accelerate -U`, but I still get an error. What should I do? There are many errors in the notebooks, which makes this book very laborious to work with. J. van der List	Juergen van der List	Jan 27, 2024
Printed	Page 6, Transfer Learning in NLP 1st paragraph, third sentence	The sentence: Architecturally, this involves, splitting the model into of a body and a head, ... Should be without "of": Architecturally, this involves, splitting the model into a body and a head, ...	Velimir Graorkoski	Jan 08, 2024
Printed	Page 23 2nd code cell	I am getting this error (which sounds a lot like one that was thought to have been fixed): FileNotFoundError: Couldn't find file at <link to dropbox file> WORK-AROUND: emotions = load_dataset("dair-ai/emotion")	Eric Cooper	Aug 15, 2023
Printed	Page 61 print(tokenize(emotions['train'][:2]	TypeError Traceback (most recent call last) <ipython-input-92-195b9a5c839d> in <cell line: 1>() ----> 1 print(tokenize(emotions['train'][:2])) <ipython-input-89-f9c17701f610> in tokenize(batch) 1 def tokenize(batch): ----> 2 return tokenizer(batch(['text'],padding = True,truncation=True)) TypeError: 'dict' object is not callable	Juergen van der List	Dec 29, 2023
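Editorial note on the entry above: the `TypeError` comes from the submitted code itself, which writes `batch(['text'], ...)` and therefore tries to call the dictionary. Indexing the batch with square brackets and passing the keyword arguments to the tokenizer avoids the error. The sketch below uses a hypothetical stand-in `fake_tokenizer` (not a Hugging Face object) so that it runs without downloading a model; only the shape of the `tokenize` call is the point.

```python
def fake_tokenizer(texts, padding=False, truncation=False):
    """Hypothetical stand-in for a real tokenizer: splits each text on whitespace."""
    token_lists = [text.split() for text in texts]
    if padding:
        # Pad every sequence to the length of the longest one in the batch.
        width = max(len(tokens) for tokens in token_lists)
        token_lists = [tokens + ["[PAD]"] * (width - len(tokens)) for tokens in token_lists]
    return {"tokens": token_lists}


def tokenize(batch, tokenizer=fake_tokenizer):
    # batch["text"] indexes the dict; batch(["text"], ...) would try to call it.
    return tokenizer(batch["text"], padding=True, truncation=True)


batch = {"text": ["i feel fine", "i am happy today"], "label": [0, 1]}
print(tokenize(batch))
```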
Printed	Page 68 (German edition) last paragraph	If I run this Python block: from umap import UMAP from sklearn.preprocessing import MinMaxScaler # Scale features to [0,1] range X_scaled = MinMaxScaler().fit_transform(X_train) # Initialize and fit UMAP mapper = UMAP(n_components=2, metric="cosine").fit(X_scaled) # Create a DataFrame of 2D embeddings df_emb = pd.DataFrame(mapper.embedding_, columns=["X", "Y"]) I get the following error: ImportError Traceback (most recent call last) ~\AppData\Local\Temp\ipykernel_17500\2438143818.py in <module> ----> 1 from umap import UMAP 2 from sklearn.preprocessing import MinMaxScaler 3 4 # Scale features to [0,1] range 5 X_scaled = MinMaxScaler().fit_transform(X_train) ImportError: cannot import name 'UMAP' from 'umap' (C:\Users\vdl\Anaconda3\lib\site-packages\umap\__init__.py) df_emb["label"] = y_train But I have umap installed via: !pip install umap and the response was: Requirement already satisfied: umap in c:\users\vdl\anaconda3\lib\site-packages (0.1.1) What should I do? Best regards, JvdL	Juergen van der List	Jan 23, 2024
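Editorial note on the entry above: as far as we know, the `UMAP` class is shipped by the `umap-learn` distribution, while the PyPI distribution named `umap` (version 0.1.1 in the report) is a different package that happens to install a module with the same import name. A minimal sketch for seeing which of the two is installed; the helper name `installed_version` is an assumption for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "umap-learn" is the distribution assumed to provide the UMAP class;
# "umap" is the one reported as installed in the entry above.
for dist_name in ("umap", "umap-learn"):
    print(dist_name, "->", installed_version(dist_name))
```

If only `umap` reports a version, uninstalling it and installing `umap-learn` instead is the usual remedy.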
Printed	Page 69 1st paragraph, 3rd line	Hello! I suspect that 'hidden_dim]' should be 'embed_dim]', pls. verify (and thank you, great read so far, clearly explained).	Jo De Baer	Mar 06, 2024
Printed	Page 75 (German edition) 1st paragraph	trainer = Trainer(model=model, args=training_args, compute_metrics=compute_metrics, train_dataset=emotions_encoded["train"], eval_dataset=emotions_encoded["validation"], tokenizer=tokenizer) trainer.train(); This raised the following error: CalledProcessError Traceback (most recent call last) /usr/local/lib/python3.10/dist-packages/huggingface_hub/repository.py in clone_from(self, repo_url, token) 668 --> 669 run_subprocess( 670 # 'git lfs clone' is deprecated (will display a warning in the terminal) 8 frames CalledProcessError: Command '['git', 'lfs', 'clone', 'xxx', '.']' returned non-zero exit status 2. During handling of the above exception, another exception occurred: OSError Traceback (most recent call last) /usr/local/lib/python3.10/dist-packages/huggingface_hub/repository.py in clone_from(self, repo_url, token) 707 708 except subprocess.CalledProcessError as exc: --> 709 raise EnvironmentError(exc.stderr) 710 711 def git_config_username_and_email(self, git_user: Optional[str] = None, git_email: Optional[str] = None): OSError: WARNING: 'git lfs clone' is deprecated and will not be updated with new flags from 'git clone' 'git clone' has been updated in upstream Git to have comparable speeds to 'git lfs clone'. Cloning into '.'... remote: Repository not found fatal: repository 'xxx' not found Error(s) during clone: git clone failed: exit status 128	Gunold Brunbauer	Jul 07, 2023
Printed	Page 132 Body of function sequence_logprob	In the definition of the function, the sum of log probabilities seq_log_prob is obtained by computing the following: torch.sum(log_probs[:, input_len:]) I think the correct way to obtain the sum of log probabilities is to compute: torch.sum(log_probs[:, input_len-1:]) This is because we also want to take into account the log probability of the first token that was generated and that directly follows the input sequence (this log probability is located at index input_len -1 since input sequence tokens are located from index 0 to index input_len-1).	Clément Luneau	Jul 19, 2023
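Editorial note on the entry above: the off-by-one argument can be checked with a toy example. It assumes the per-token log probabilities are aligned in the usual next-token way, so that entry i scores token i + 1 given the tokens before it (this alignment is an assumption here, as are the made-up probabilities). Under that assumption the first generated token is scored at index `input_len - 1`.

```python
import math

tokens = ["A", "B", "C", "x", "y"]  # 3 prompt tokens followed by 2 generated tokens
input_len = 3

# Made-up probabilities p(tokens[i + 1] | tokens[: i + 1]) for i = 0..3.
next_token_probs = [0.5, 0.4, 0.2, 0.1]
log_probs = [math.log(p) for p in next_token_probs]

# log_probs[i] scores tokens[i + 1], so "x" (position 3) is scored at index 2.
from_input_len = log_probs[input_len:]              # scores only "y"
from_input_len_minus_1 = log_probs[input_len - 1:]  # scores "x" and "y"

print(len(from_input_len), len(from_input_len_minus_1))
print(sum(from_input_len_minus_1))  # log(0.2) + log(0.1)
```

With the slice starting at `input_len`, one of the two generated tokens is left out of the sum, which is the discrepancy the submitter describes.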
Printed	Page 161 2nd paragraph	The sentence: "Then, as usual, we set up a the TrainingArguments for training:" The word "a" is superfluous and should be removed.	Velimir Graorkoski	Jan 15, 2024
Printed	Page 173, Extracting Answers from Text 2nd paragraph, second sentence under the section Extracting Answers from Text	The sentence: "For example, if a we have a question like ..." Should be (without the first "a"): "For example, if we have a question like ..."	Velimir Graorkoski	Jan 15, 2024
Printed	Page 220 (German Edition) 6th paragraph	The code "!curl -X GET 'localhost:9200/?pretty'" does not work with the current version of Elasticsearch, but it works with version 7.9.3.	Gunold Brunbauer	Jul 31, 2023
Printed	Page 254 3rd paragraph	In 3rd paragraph, "Next let's take a look at the top 10 most frequent~" should be "Next let's take a look at the top 8 most frequent~". Thanks	Haesun Park	Aug 29, 2022
Printed	Page 260 1st line	In header at first line, "Implementing Naive Bayesline" should be "Implementing Naive Bayes" or "Implementing Naive Bayes baseline". Thanks.	Haesun Park	Aug 29, 2022
Printed	Page 265, Working with No Labeled Data Table 9-1	The name of the table contains MLNI acronym instead of MNLI.	Velimir Graorkoski	Jan 20, 2024
Printed	Page 269 13th line and 15th line from the bottom	In 13th line and 15th line from the bottom, "Best threshold (micro)" should be "Best threshold (macro)". Thanks.	Haesun Park	Aug 29, 2022
Printed	Page 271, Working with No Labeled Data 2nd paragraph from the end, 2nd bullet point	The hypothesis mentioned in the point states: "This is example is about". One "is" is sufficient.	Velimir Graorkoski	Jan 21, 2024
Printed	Page 273 code block in the middle	In recent versions of nlpaug, aug.augment(text) always returns a list, even if n=1. So `text_aug += [aug.augment(text)]` should be `text_aug += aug.augment(text)`. Thanks.	Haesun Park	Oct 14, 2022
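Editorial note on the entry above: the difference between the two statements is plain Python list behavior. The sketch below uses a hypothetical stand-in `augment` function that always returns a list, mirroring what the submitter reports for recent nlpaug versions; it is not the nlpaug API itself.

```python
def augment(text, n=1):
    """Hypothetical stand-in for aug.augment: always returns a list of n strings."""
    return [f"{text} (variant {i})" for i in range(n)]


text = "transformers are great"

nested = []
nested += [augment(text)]  # wraps the returned list again -> list of lists

flat = []
flat += augment(text)      # extends with the returned strings -> list of str

print(nested)
print(flat)
```

The nested form would hand lists instead of strings to any later step that expects one string per augmented example.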
Printed	Page 274 4th line from the bottom	In the 4th line from the bottom, what does "around 5 point" mean? Does it mean 5 percentage points, or something else? Please explain. Thanks.	Haesun Park	Aug 29, 2022
Printed	Page 283 6th line from the bottom	In 6th line from the bottom, "k/n elements to compare" should be "n/k elements to compare". Thanks.	Haesun Park	Aug 30, 2022
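Editorial note on the entry above: the proposed correction can be checked with a little arithmetic. Assuming n vectors are split into k roughly equal clusters and only the nearest cluster is searched (both simplifying assumptions, not taken from the book's text), a query is compared against k centroids and then about n/k vectors. The function name `comparisons` is chosen for illustration.

```python
import math


def comparisons(n, k):
    """Comparisons for a two-stage search: k centroids, then about n/k vectors in one cluster."""
    return k + n / k


n = 1_000_000
for k in (10, 1_000, 100_000):
    print(k, comparisons(n, k))

# Under these assumptions the total is smallest when k is about sqrt(n).
best_k = math.isqrt(n)
print(best_k, comparisons(n, best_k))
```

A count of k/n would be less than one element for any k < n, which is why n/k is the reading that makes sense.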
Printed	Page 291 5th, 6th line from the top	In 5th, 6th line from the top, original_input_ids and masked_input_ids are not defined. I think they are inputs["input_ids"][0] and outputs["input_ids"][0]. Thanks.	Haesun Park	Aug 29, 2022
Printed	Page 308 6th line from the bottom	In 6th line from the bottom, "This reduces the memory footprint of our dataset from 180 GB to 50 GB". So, when streaming codeparrot dataset, 50 GB memory is needed? Thanks.	Haesun Park	Sep 08, 2022
Printed	Page 337 1st line from the top	In the 1st line from the top, I suggest changing "reduce pattern" to "all-reduce pattern" for clarity. Thanks.	Haesun Park	Sep 08, 2022
Printed	Page 341 1st paragraph	The 1st paragraph says "it didn't quite get it right in the second attempt", but the second attempt is the only correct answer. Please clarify what is meant. Thanks.	Haesun Park	Sep 08, 2022
Printed	Page 360 Last code block	In the last code block, the `if` and `else` branches contain the same code. Thanks.	Haesun Park	Sep 13, 2022
Printed	Page 361 1st paragraph	In 1st paragraph, "For the first chapter, the model predict..." should be "For the first question, the model predict..." Thanks	Haesun Park	Sep 13, 2022