Regular Expressions Cookbook

Errata for Regular Expressions Cookbook


The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.



Version Location Description Submitted By Date Submitted
PDF Page 26
Solution, line 2

The Ruby 1.9 regular expression editor (rubular.com) reports the following error for this code: "Forward slashes must be escaped."

Anonymous  Mar 09, 2012 
PDF Page 32
Shorthands, paragraph 3

Current text: "In Java, JavaScript, PCRE, and Ruby, \w is always identical to [a-zA-Z0-9_]." However, for \w the Ruby 1.9 regular expression editor (rubular) also highlights special characters of European languages (such as ă) and Cyrillic characters.

Anonymous  Mar 09, 2012 
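The \w report above (page 32) concerns Ruby 1.9 and is not settled here. As a related illustration only, the following Python sketch shows how the same shorthand can be Unicode-aware or ASCII-only depending on the engine and its flags. The sample string is an assumption chosen for this example.

```python
import re

# Sample text (assumed for illustration): Latin "a", Romanian "ă" (U+0103),
# Cyrillic "ж" (U+0436), an underscore, and a digit.
sample = "a\u0103\u0436_1"

# In Python 3, \w in a str pattern is Unicode-aware by default.
unicode_words = re.findall(r"\w", sample)

# With re.ASCII, \w is restricted to [a-zA-Z0-9_].
ascii_words = re.findall(r"\w", sample, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```

Only the second call behaves like [a-zA-Z0-9_]; this says nothing about which behavior Ruby 1.9 itself implements.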
PDF Page 65
Last sentence

The last paragraph says "For variable repetition, we use the quantifier <{n,m}>, where n is a POSITIVE NUMBER and m is greater than n....". I think this should read "where n is a NON-NEGATIVE INTEGER etc....", since n can be 0. Saying "integer" rather than "number" also makes it explicit that n must be an integer, although most readers would probably reach this conclusion either way.

Bryce Thomas  Jan 10, 2010 
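The point of the page 65 report above, that n may be 0, can be checked with a short Python sketch. The pattern and the helper name accepts are assumptions made for this example, not taken from the book.

```python
import re

# {0,2}: the lower bound n is 0, so zero repetitions are allowed.
pattern = re.compile(r"a{0,2}")

def accepts(text):
    """Return True if the whole text consists of zero to two 'a' characters."""
    return pattern.fullmatch(text) is not None

# Try the empty string and one, two, and three repetitions.
results = {text: accepts(text) for text in ["", "a", "aa", "aaa"]}
print(results)
```

The empty string is accepted, which is consistent with describing n as a non-negative integer.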
Printed Page 249
Regexes

Page 250 explains that a trailing line break adds an extra empty line. How, then, do you deal with a leading line break? If you type "\nabc" in a text editor, you get TWO lines instead of ONE. I would therefore argue that "\nabc\nabc\nabc\nabc\nabc" also contains SIX lines, the first of which is empty, yet it is accepted by the given regexes. Am I getting something wrong here? Thanks.

Note from the Author or Editor:
You've gotten it right. You can fix the handling for strings starting with a line break by using \A(?>[^\r\n]*(?>\r\n?|\n)?){0,4}[^\r\n]*\z or equivalent. The necessary changes to the recipe are extensive due to the numerous variations listed.

Yao G.  Sep 03, 2009 
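The fix in the author's note above uses atomic groups and \z, which older versions of Python's re module do not support. The sketch below is therefore an adaptation under stated assumptions, not the author's regex: it makes the line break inside the group mandatory (so no atomic group is needed) and uses Python's \Z end-of-string anchor. The function name at_most_five_lines is hypothetical.

```python
import re

# Adaptation (an assumption, not the regex from the note): up to four
# "line + mandatory line break" units, followed by one final line that
# has no line break. \Z matches only at the very end of the string in Python.
AT_MOST_FIVE_LINES = re.compile(r"\A(?:[^\r\n]*(?:\r\n?|\n)){0,4}[^\r\n]*\Z")

def at_most_five_lines(text):
    """Return True if text has at most five lines, counting empty ones."""
    return AT_MOST_FIVE_LINES.match(text) is not None

# A leading line break counts as an empty first line: six lines, rejected.
print(at_most_five_lines("\nabc\nabc\nabc\nabc\nabc"))
# Five lines without a leading or trailing line break: accepted.
print(at_most_five_lines("abc\nabc\nabc\nabc\nabc"))
```

Because each unit must end in a line break, a single line cannot be split across several repetitions of the group, which should keep backtracking small on long non-matching input.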
Printed Page 249
3 regular expressions as the solution

They match "\n2\n3\n4\n5\n" but don't match "1\n2\n3\n4\n5\n" (here \n stands for \r, \r\n, or \n). Given "1\n2\n3\n4\n5\nsome long text", the Python version does not respond for several minutes.

Anonymous  Feb 19, 2010 
Other Digital Version 291
"Validate the Number" section

In the EPUB version, none of the credit card companies after "Visa" are listed with their displayed formats (as they are on page 275 of the PDF version).

Anonymous  Nov 21, 2010 
Other Digital Version 421
Solution

The current regex "[^"\\\r\n]*(?:\\.[^"\\\r\n]*)" matches a string with only one embedded escaped quote and fails to match the whole string if multiple escaped quotes are present. We believe the correct regex should be "[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*". Note the extra "*" after the parenthesized expression.

Paul Rubel & Dan Wyschogrod  Aug 29, 2013
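The difference between the two regexes in the report above can be reproduced with a short Python sketch. The names REPORTED and PROPOSED and the sample strings are assumptions made for this example; the patterns are copied from the report, with the outer double quotes treated as part of the regex.

```python
import re

# Regex as quoted in the report (no trailing * on the group).
REPORTED = re.compile(r'"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)"')
# Regex proposed by the submitters (group may repeat zero or more times).
PROPOSED = re.compile(r'"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*"')

# Sample quoted strings (assumed): one and two escaped quotes inside.
one_escape = r'"say \"hi"'
two_escapes = r'"say \"hi\" now"'

for name, regex in [("reported", REPORTED), ("proposed", PROPOSED)]:
    for text in [one_escape, two_escapes]:
        matched = regex.fullmatch(text) is not None
        print(name, text, matched)
```

With these samples, the reported form matches the whole string only when exactly one escape is present, while the proposed form matches both.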