Errata for Python for Data Analysis

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version Location Description Submitted by Date submitted
Other Digital Version Preface
Using Code Examples

Two words are in the wrong order in the version on wesmckinney.com:

You can data find files

should be

You can find data files

Steven Mooney  Feb 16, 2024 
Printed, ePub Page Section 3.1, page 59
1st paragraph

The example:
In [118]: hash("string")
Out[118]: 3634226001988967898

However, when I ran it, I got inconsistent results from the hash function.
Below are the results of running the function four consecutive times:
-783493489962912440
-2593540438211823544
5958934601557521611
1519405966352344185

Thus this function could not be used to verify that the object "string" can be used as a dictionary key.

I am using a 2021 iMac with an Apple M1 chip, 16 GB of memory, and macOS Sonoma 14.2.1.

I am using PyCharm 2023.3.3 (Community Edition)
Build #PC-233.13763.11, built on January 25, 2024
Runtime version: 17.0.9+7-b1087.11 aarch64
VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o.
macOS 14.2.1
GC: G1 Young Generation, G1 Old Generation
Memory: 2048M
Cores: 8
Metal Rendering is ON
Registry:
ide.experimental.ui=true
Non-Bundled Plugins:
com.jetbrains.edu (2024.1-2023.3-882)

Patrick Salkeld  Feb 16, 2024 
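
A possible explanation for the report above: since Python 3.3, hash values of str objects are salted with a random value chosen at interpreter startup, unless the PYTHONHASHSEED environment variable is set. The values are therefore expected to differ between runs while staying consistent within a single run, which is all a dictionary key requires. The sketch below illustrates this under the assumption that a standard CPython interpreter is available as sys.executable; the helper name hash_in_new_process is hypothetical and not from the book.

```python
import os
import subprocess
import sys


def hash_in_new_process(value, seed=None):
    """Return hash(value) as computed by a freshly started interpreter.

    Hypothetical helper written only for this illustration.
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)
    if seed is not None:
        # A fixed seed makes str hashes reproducible across runs.
        env["PYTHONHASHSEED"] = str(seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({value!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Within one process the hash is stable, so "string" works as a dict key.
same_in_process = hash("string") == hash("string")
lookup = {"string": 1}["string"]

# With a fixed seed, two separate interpreter runs agree with each other.
seeded_a = hash_in_new_process("string", seed=0)
seeded_b = hash_in_new_process("string", seed=0)
```

Under this reading, the specific number printed in the book is just the value from one particular session, and the point of the example is that the call does not raise an exception.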
Other Digital Version Section 5.2; Indexing, Selection, and Filtering
Selection on DataFrame with loc and iloc

The word rows is misspelled as "roles".

The result of selecting a single row is a Series with an index that contains the DataFrame's column labels. To select multiple roles, creating a new DataFrame, pass a sequence of labels:

Andrei  Feb 17, 2024 
Other Digital Version Generator expressions
3rd code listing

Syntax typo in the statement `dict((i, i **2) for i inrange(5))`:
there should be a space between the keyword `in` and the function name `range`.

Ben To  Feb 19, 2024 
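
For reference, a minimal sketch of the corrected statement, with the space restored between `in` and `range`. The variable names are chosen here for illustration only.

```python
# Generator expression passed to dict(), with the space restored.
squares = dict((i, i ** 2) for i in range(5))

# An equivalent dictionary comprehension, shown for comparison.
squares_comprehension = {i: i ** 2 for i in range(5)}

print(squares)
```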
Other Digital Version Set
hashable set elements part

A space is missing before the first opening parenthesis in the sentence "set elements generally must be immutable, and they must be hashable(which means that calling hash on a value does not raise an exception)."

Ben To  Feb 19, 2024 
Printed, ePub Page Page 98, Section 4.1
First example, first 3 paragraphs

When I tried to reproduce this example:
names = np.array(["Bob", "Joe", "Will", "Bob", "Will", "Joe", "Joe"])
data = ([[4, 7], [0,2], [-5, 6], [0, 0],[1, 2], [-12, -4], [3, 4]])
names == "Bob"
data[names == "Bob"]

I got this error:
Traceback (most recent call last):
File "/Volumes/Extreme SSD/Python Data Analysis/Python3_for_Data_Analysis/main.py", line 550, in <module>
data[names == "Bob"]
~~~~^^^^^^^^^^^^^^^^
TypeError: only integer scalar arrays can be converted to a scalar index

This contradicts the subsequent text which states:
"...You can even mix and match Boolean arrays with slices or integers (or sequences of integers; more on this later)."

Patrick Salkeld  Feb 19, 2024 
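
A likely cause of the error reported above is that `data` was created as a plain Python list (the outer parentheses do nothing) rather than with `np.array`. A list cannot be indexed with a Boolean array, whereas an ndarray can. The sketch below assumes NumPy is installed and reuses the values from the report; `data_list`, `list_error`, and `bob_rows` are names introduced here for illustration.

```python
import numpy as np

names = np.array(["Bob", "Joe", "Will", "Bob", "Will", "Joe", "Joe"])

# As typed in the report: a plain nested list, not an ndarray.
data_list = [[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]]

list_error = None
try:
    data_list[names == "Bob"]
except TypeError as exc:
    # Lists only accept integers and slices as indices.
    list_error = exc

# Wrapping the same values in np.array makes Boolean indexing work.
data = np.array(data_list)
bob_rows = data[names == "Bob"]
print(bob_rows)
```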
Other Digital Version Chapter 4 - Data Types for ndarrays
Second note box

Where the online text says "A signed integer can represent both positive and negative integers, while an unsigned integer can only represent nonzero integers", the phrase "nonzero integers" should be "non-negative integers".

Ben To  Mar 04, 2024 
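
The suggested wording can be checked with `np.iinfo`, which reports the representable range of an integer dtype. This is a small sketch assuming NumPy is installed; the 8-bit types are picked only as an example.

```python
import numpy as np

signed = np.iinfo(np.int8)     # signed 8-bit integer
unsigned = np.iinfo(np.uint8)  # unsigned 8-bit integer

# The unsigned range starts at zero, so zero is representable
# and "non-negative" is the accurate description.
print(signed.min, signed.max)
print(unsigned.min, unsigned.max)
```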
O'Reilly learning platform Page Chapter 10.x
Throughout the chapter

Chapter 10 uses DataFrame.groupby(..., axis="columns") on several occasions; the axis argument is deprecated.

Jochen Schüttler  Apr 09, 2024 
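
One workaround, which matches what the pandas deprecation message suggests, is to transpose the frame, group its rows, and transpose the result back. The sketch below assumes pandas is installed; the frame and the column-to-group mapping are made up for illustration and are not the chapter's data.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Hypothetical mapping from column label to group name.
mapping = {"a": "red", "b": "red", "c": "blue"}

# Instead of frame.groupby(mapping, axis="columns").sum():
by_column = frame.T.groupby(mapping).sum().T
print(by_column)
```

Transposing twice can change dtypes when the columns have mixed types, so this is best treated as a starting point rather than a drop-in replacement.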
Other Digital Version 4.2 Pseudorandom Number Generation
Table 4.3: NumPy random number generator methods

The `uniform` function is listed twice in the table.

Ben To  Mar 09, 2024 
Other Digital Version 4.4 Array-Oriented Programming with Arrays
first code listing

In [169]: points = np.arange(-5, 5, 0.01) # 100 equally spaced points

But this produces 1000 points, not 100.

Ben To  Mar 11, 2024 
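
The count is easy to confirm: a step of 0.01 over an interval of length 10 gives 1000 values. The sketch below assumes NumPy is installed. The `np.linspace` call is shown only as one way to get exactly 100 points; it includes the endpoint 5, so it is not a drop-in replacement for the book's line.

```python
import numpy as np

# As printed in the book: step 0.01 from -5 up to (not including) 5.
points = np.arange(-5, 5, 0.01)
print(len(points))

# For comparison: exactly 100 evenly spaced points, endpoint included.
coarse = np.linspace(-5, 5, 100)
print(len(coarse))
```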
ePub Page 5 Indexing, Selection and Filtering
Using Code Examples

In the following sentence, should 'columns' be changed to 'rows'? When I test this, it prints 2 rows and all the columns.

The row selection syntax data[:2] is provided as a convenience. Passing a single element or a list to the [] operator selects columns.


Steven Mooney  Feb 21, 2024 
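
The quoted sentences appear to describe two different cases: a slice such as `data[:2]` selects rows, while a single label or a list of labels selects columns. The sketch below assumes pandas is installed and uses a small made-up frame rather than the book's exact data.

```python
import pandas as pd

data = pd.DataFrame(
    {"one": [0, 4, 8, 12], "two": [1, 5, 9, 13], "three": [2, 6, 10, 14]},
    index=["Ohio", "Colorado", "Utah", "New York"],
)

# A slice inside [] selects rows: the first two rows, all columns.
first_two_rows = data[:2]

# A single label selects one column, returned as a Series.
one_column = data["two"]

# A list of labels selects several columns, returned as a DataFrame.
two_columns = data[["three", "one"]]
```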
O'Reilly learning platform Page 10.2
6th code box, In [72]

The code example is "grouped_pct.agg([("average", "mean"), ("stdev", np.std)])". There is a FutureWarning to use "grouped_pct.agg([("average", "mean"), ("stdev", "std")])" instead.

Jochen Schüttler  Apr 09, 2024 
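
A minimal sketch of the string-based form mentioned in the report, assuming pandas is installed. The small `tips` frame is made up for illustration and only mimics the column names used in the chapter.

```python
import pandas as pd

# Made-up stand-in for the chapter's tipping data.
tips = pd.DataFrame(
    {
        "day": ["Fri", "Fri", "Sat", "Sat"],
        "tip_pct": [0.10, 0.20, 0.15, 0.25],
    }
)

grouped_pct = tips.groupby("day")["tip_pct"]

# Each tuple is (output column name, aggregation given by its string name).
summary = grouped_pct.agg([("average", "mean"), ("stdev", "std")])
print(summary)
```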
Other Digital Version 13.3 US Baby Names In[116]
China edition page415

Based on the preceding code block (def~~), the table shown for In[116]: names / Out[116] may be wrong.
It should be:
                         name sex  births  year      prop
year sex
1880 F   0               Mary   F    7065  1880  0.077643
         1               Anna   F    2604  1880  0.028618
         2               Emma   F    2003  1880  0.022013
         3          Elizabeth   F    1939  1880  0.021309
         4             Minnie   F    1746  1880  0.019188
...  ... ...              ... ...     ...   ...       ...
2010 M   1690779      Zymaire   M       5  2010  0.000003
         1690780       Zyonne   M       5  2010  0.000003
         1690781    Zyquarius   M       5  2010  0.000003
         1690782        Zyran   M       5  2010  0.000003
         1690783        Zzyzx   M       5  2010  0.000003

Zhang yingtan  Mar 19, 2024 
Printed Page 169
In[285]

In[283] and In[285] look exactly the same, even though the line above says that a more concise syntax could be used.

Jude Cancellieri  Mar 09, 2024