2.18. Add Comments to a Regular Expression
Problem
‹\d{4}-\d{2}-\d{2}
› matches a date in yyyy-mm-dd
format, without doing any validation of the numbers. Such a simple
regular expression is appropriate when you know your data does not
contain any invalid dates. Add comments to this regular expression to
indicate what each part of the regular expression does.
Solution
\d{4} # Year - # Separator \d{2} # Month - # Separator \d{2} # Day
Regex options: Free-spacing |
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby |
Discussion
Free-spacing mode
Regular expressions can quickly become complicated and difficult to understand. Just as you should comment source code, you should comment all but the most trivial regular expressions.
All regular expression flavors in this book, except JavaScript, offer an alternative regular expression syntax that makes it very easy to clearly comment your regular expressions. You can enable this syntax by turning on the free-spacing option. It has different names in various programming languages.
In .NET, set the RegexOptions.IgnorePatternWhitespace
option. In
Java, pass the Pattern.COMMENTS
flag. Python expects
re.VERBOSE
. PHP, Perl, and Ruby use the
/x
flag.
Though standard JavaScript does not support free-spacing regular
expressions, the XRegExp library adds that option. Simply add 'x'
to the flags passed as the
second parameter to the XRegExp()
constructor.
Turning on free-spacing mode has two effects. It turns the hash symbol (#
) into a metacharacter, outside ...
Get Regular Expressions Cookbook, 2nd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.