2.18. Add Comments to a Regular Expression

Problem

\d{4}-\d{2}-\d{2} matches a date in yyyy-mm-dd format, without doing any validation of the numbers. Such a simple regular expression is appropriate when you know your data does not contain any invalid dates. Add comments to this regular expression to indicate what each part of the regular expression does.

Solution

\d{4}    # Year
-        # Separator
\d{2}    # Month
-        # Separator
\d{2}    # Day
Regex options: Free-spacing
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

Discussion

Free-spacing mode

Regular expressions can quickly become complicated and difficult to understand. Just as you should comment source code, you should comment all but the most trivial regular expressions.

All regular expression flavors in this book, except JavaScript, offer an alternative regular expression syntax that makes it very easy to clearly comment your regular expressions. You can enable this syntax by turning on the free-spacing option. It has different names in various programming languages.

In .NET, set the RegexOptions.IgnorePatternWhitespace option. In Java, pass the Pattern.COMMENTS flag. Python expects re.VERBOSE. PHP, Perl, and Ruby use the /x flag.

Though standard JavaScript does not support free-spacing regular expressions, the XRegExp library adds that option. Simply add 'x' to the flags passed as the second parameter to the XRegExp() constructor.

Turning on free-spacing mode has two effects. It turns the hash symbol (#) into a metacharacter, outside ...

Get Regular Expressions Cookbook, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.