Errata for Hands-On Large Language Models


The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.


Version | Location | Description | Submitted by | Date submitted
PDF Page 52
"Generating Your First Text" code section

While trying to run the example using the model microsoft/Phi-3-mini-4k-instruct, I encountered the following error during execution:

AttributeError: 'DynamicCache' object has no attribute 'get_max_length'. Did you mean: 'get_seq_length'?

I have already tried updating the transformers library to version 4.52.4, which should be compatible with the Phi-3 model. I also cleared the Hugging Face cache using huggingface-cli delete-cache and reinstalled all relevant packages. Despite these steps, the issue remains unresolved and the same error keeps appearing.

It seems like the issue is related to how the model handles past_key_values during text generation, particularly with DynamicCache.
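
A minimal workaround sketch (not an official fix), assuming the version pin from the book's companion notebooks still resolves the DynamicCache incompatibility; see also the next erratum:

# Downgrade to the transformers release the notebooks were written against
# !pip install -q transformers==4.41.2 accelerate

import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

print(transformers.__version__)  # should report 4.41.2 after reinstalling

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="auto",
    torch_dtype="auto",
)
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=50,
    do_sample=False,
)
print(generator("Write a one-sentence greeting.")[0]["generated_text"])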

Pablo Garrido   May 31, 2025 
PDF Page 33
before the code "from transformers import AutoModelForCausalLM, AutoTokenizer"

The code will fail if you skip the following installation at the beginning:

!pip install -q transformers==4.41.2 accelerate

The version of transformers installed by default is now 4.52.4, which causes the following error:

AttributeError: 'DynamicCache' object has no attribute 'get_max_length'


凌星寒  Jun 21, 2025 
Printed Page 41 and 44
Figures 2-4 and 2-5

Figures 2-4 and 2-5 show the same token ID for different tokens:

the -> 278
b -> 278

A token identifier must be unique, and the text is consistent with this idea; however, the diagrams contradict this property.
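
A quick way to check the actual IDs (a sketch, assuming the chapter's microsoft/Phi-3-mini-4k-instruct tokenizer; the tokenizer used to produce the figures may differ):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
ids = tokenizer("the b", add_special_tokens=False).input_ids
for token_id, token in zip(ids, tokenizer.convert_ids_to_tokens(ids)):
    # each distinct token string maps to exactly one ID in the vocabulary
    print(token_id, token)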

Pablo Francisco Pérez Hidalgo  Aug 11, 2025 
Printed Page 49
Last 2 Bullet Points

On page 49, the last two bullet points describing observations about the GPT-2 tokenizer are identical. Either the same point was reprinted twice by mistake, or it replaced another intended observation. Please check it out and fix it!

Harshit Dawar  May 27, 2025 
Printed Page 51
Last Paragraph for GPT-4 (Bullet Points to differentiate between GPT-4 & GPT-2 Tokenizer)

On page 51, the third bullet point of the observations about the GPT-4 tokenizer states that the word "tokens" is represented with a single token. However, in the GPT-4 tokenizer output shown just above the observations on the same page, it is "_tokens" that is marked as a single token.

Either of two cases is possible:
1. "_tokens" is marked as one token by mistake in the GPT-4 tokenizer output; in reality it should be two tokens, i.e., "_" and "tokens".
2. The observation is wrong: instead of saying "tokens" is represented with one token, it should say "_tokens" is represented with one token.

Please check this out and fix it! A quick way to verify is sketched below.
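
One way to check (a sketch, assuming the GPT-4 tokenizer in the book corresponds to tiktoken's cl100k_base encoding):

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # the encoding used by GPT-4
for text in ["tokens", " tokens"]:
    ids = enc.encode(text)
    print(repr(text), "->", ids, [enc.decode([i]) for i in ids])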

Harshit Dawar  May 27, 2025 
Printed Page 52
13th line from the bottom

'This is an encoder that focuses on code generation'
should be
'This is a decoder that focuses on code generation'

Haesun Park  Jun 21, 2025 
PDF Page 54
Right before the recap

At the end of page 53, the Phi-3 (and Llama 2) tokenizer is introduced and its characteristics are explained, but the actual tokenization result is never shown.

It can only be seen on page 55, in the recap table.

Even though it can be found there, this breaks the reader's flow and the structure of the book, since the tokenization result is shown for all the other tokenizers.

Ivan Castano  Sep 16, 2025 
Printed Page 77
In Figure 3-5

In the upper-right and lower-right figures, token ID 50,000 should be 49,999.

Haesun Park  Jun 21, 2025 
Printed Page 90, 91
In Figure 3-16, 3-17

In these two figures, 'combining information' seems to refer to the linear layer and is included in the attention head. But in Figure 3-26, this linear layer is shown separately from the attention heads. This could be confusing to readers.

Haesun Park  Jun 21, 2025 
PDF Page 91
Figure 3-17

Per description "Figure 3-17 shows the intuition of how attention heads run in parallel with a preceding step of splitting information and a later step of combining the results of all the heads."

However, Figure 3-17 only shows one attention head and doesn't have the step of "combining the results of all the heads".

凌星寒  Jun 24, 2025 
Printed Page 103
3rd line from the bottom

'that capture absolute and relative token position information' should be 'that capture relative token position information'

Haesun Park  Jun 21, 2025 
Printed Page 112
3rd line from the top

'both representation and language models' should be 'both representation and generative language models'

Haesun Park  Jun 21, 2025 
Printed Page 113
1st paragraph

Here, it suggests evaluating generalization on the validation set when hyperparameter tuning is done using the training and test sets, but this is not the standard practice. Hyperparameter tuning should use the validation set, while the test set should be used only once at the end to assess the final generalization performance of the trained model.

Haesun Park  Jun 21, 2025 
Printed Page 125
In Figure 4-15

'The cosine similarity is the angle between two vectors' should be 'The cosine similarity is the cosine of the angle between two vectors'
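
A quick numeric illustration of the corrected wording (a minimal sketch, not code from the book):

import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
angle_deg = np.degrees(np.arccos(cos_sim))
print(angle_deg, cos_sim)  # ~45.0 degrees and ~0.707: the similarity is cos(angle), not the angle itself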

Haesun Park  Jun 21, 2025 
Printed Page 129
In Figure 4-19

'a decoder-encoder architecture' should be 'an encoder-decoder architecture'.

Haesun Park  Jun 21, 2025 
Printed Page 163
Code snippet

The correct code is:

# Visualize topics and documents
fig = topic_model.visualize_document_datamap(
    titles,
    topics=list(range(20)),
    reduced_embeddings=reduced_embeddings,
    width=1200,
    # label_font_size, label_wrap_width, and use_medoids are not official
    # arguments of visualize_document_datamap in BERTopic; pass them through
    # the datamap_kwds dictionary instead:
    datamap_kwds={
        "label_font_size": 11,
        "label_wrap_width": 20,
        "use_medoids": True,
    },
)
plt.savefig("datamapplot.png", dpi=300)

ERIC MEURVILLE  Jul 01, 2025 
Printed Page 168
Figure 6-1

In Figure 6-1, the description of the Llama 2 model is 7B/13B70B. It should be 7B/13B/70B (the / between 13B and 70B is missing).

Theodoros Athanasiadis  Jun 19, 2025 
Printed Page 179
Bottom line

'adding it to the `data` variable' should be 'adding it to the `text` variable'

Haesun Park  Jun 21, 2025 
Printed Page 181
In Figure 6-13

The caption of Figure 6-13 is duplicated from that of Figure 6-11.

Haesun Park  Jun 21, 2025 
Printed Page 191
2nd line above the 'Output Verification' section.

'such a conservation' should be 'such a conversation'

Haesun Park  Jun 21, 2025 
Printed Page 201
figure 7-2

The float 16-bit representation is incorrect.

Float 16-bit (half precision) in IEEE 754 format:
- Bit 15: Sign
- Bits 14–10: Exponent (5 bits, not 8, biased by 15)
- Bits 9–0: Mantissa (10 bits, with an implicit leading 1 for normalized numbers)
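
A quick way to see the 1/5/10-bit split (a minimal sketch, not code from the book):

import struct

# Pack -1.5 as an IEEE 754 half-precision float and show its bit fields
bits = format(int.from_bytes(struct.pack(">e", -1.5), "big"), "016b")
sign, exponent, mantissa = bits[0], bits[1:6], bits[6:]
print(sign, exponent, mantissa)  # 1 01111 1000000000: 1 sign bit, 5 exponent bits, 10 mantissa bits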

ERIC MEURVILLE  Jul 01, 2025 
Printed Page 204
2nd line below Figure 7-5

The `system_prompt` is not included in the template. Also, there is no need to include the <s> token in the template, because llama-cpp-python adds it automatically. A possible revision is sketched below.
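
A sketch only, assuming the Phi-3 chat format and the `system_prompt`/`input_prompt` variable names used in the chapter:

# Prompt template including the system prompt, without a leading <s>
# (llama-cpp-python adds the BOS token itself)
template = """<|system|>
{system_prompt}<|end|>
<|user|>
{input_prompt}<|end|>
<|assistant|>"""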

Haesun Park  Jun 21, 2025 
Printed Page 207
Top of the page

llm_chain.invoke("a girl that lost her mother")

should be

llm_chain.invoke({"summary" : "A girl that lost her mother"})

Soner Balkir  Aug 24, 2025 
Printed Page 210
In Figure 7-10

'Conversation history' should be 'Current conversation'.

Haesun Park  Jun 21, 2025 
Printed Page 213
Top of the page

llm_chain.predict({"input_prompt":"What is 3 + 3?"})

should be replaced by

llm_chain.invoke({"input_prompt":"What is 3 + 3?"})

Eric Meurville  Jul 10, 2025 
Printed Page 216
Top of the page

# Generate a conversation and ask for the name
llm_chain.invoke({"input_prompt": "Hi! My name is Maarten. What is 1 + 1?"})
llm_chain.invoke({"input_prompt": "What is my name?"})

should return

{'input_prompt': 'What is my name?',
'chat_history': ' Maarten introduces himself and asks for the sum of 1 + 1, to which the AI responds that it equals 2. The AI provides a brief explanation about addition being a basic arithmetic operation resulting from combining numbers, in this case adding one unit to another to get a total of two units or items.',
'text': " Your name was mentioned as Maarten when you introduced yourself; therefore, based on the current conversation, your name is Maarten.\nHere's an explanation for 1 + 1 = 2: Addition is one of the four fundamental arithmetic operations and involves combining quantities. When we add 1 unit to another 1 unit, we are essentially counting up by one from the first number (which is 1), arriving at a total of two units or items."}

Eric Meurville  Jul 10, 2025 
Printed Page 298
18th line from the bottom

'similarity scores between 1 and 5' should be 'similarity scores between 0 and 5'

Haesun Park  Jun 21, 2025 
Printed Page 299
17th line from the top

'during evaluation' should be 'during training' in explanation of per_device_train_batch_size.

Haesun Park  Jun 21, 2025 
Printed Page 327
12th line from the bottom

`trainer.train()` is omitted.

Haesun Park  Jun 21, 2025 
Printed Page 354
3rd line from the bottom

'Using a two-step process' should be 'Using a three-step process'.

Haesun Park  Jun 21, 2025 
Printed Page 371
3rd last paragraph

In the 'Training Configuration' section of 'Instruction Tuning with QLoRA' in chapter 12, it is stated regarding the 'num_train_epochs' parameter:
'The total number of training rounds. Higher values tend to degrade performance so we generally like to keep this low.'

Don't higher values typically lead to better performance?

Marcus Fraaß  May 16, 2025