2.11. Capture and Name Parts of the Match

Problem

Create a regular expression that matches any date in yyyy-mm-dd format and separately captures the year, month, and day. The goal is to make it easy to work with these separate values in the code that processes the match. Contribute to this goal by assigning the descriptive names “year,” “month,” and “day” to the captured text.

Create another regular expression that matches “magical” dates in yyyy-mm-dd format. A date is magical if the year minus the century, the month, and the day of the month are all the same numbers. For example, 2008-08-08 is a magical date. Capture the magical number (08 in the example), and label it “magic.”

You can assume all dates in the subject text to be valid. The regular expressions don’t have to exclude things like 9999-99-99, because these won’t occur in the subject text.

Solution

Named capture

\b(?<year>\d\d\d\d)-(?<month>\d\d)-(?<day>\d\d)\b
Regex options: None
Regex flavors: .NET, Java 7, XRegExp, PCRE 7, Perl 5.10, Ruby 1.9
\b(?'year'\d\d\d\d)-(?'month'\d\d)-(?'day'\d\d)\b
Regex options: None
Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby 1.9
\b(?P<year>\d\d\d\d)-(?P<month>\d\d)-(?P<day>\d\d)\b
Regex options: None
Regex flavors: PCRE 4 and later, Perl 5.10, Python

Named backreferences

\b\d\d(?<magic>\d\d)-\k<magic>-\k<magic>\b
Regex options: None
Regex flavors: .NET, Java 7, XRegExp, PCRE 7, Perl 5.10, Ruby 1.9
\b\d\d(?'magic'\d\d)-\k'magic'-\k'magic'\b
Regex options: None
Regex flavors: .NET, PCRE 7, Perl 5.10, Ruby ...

Get Regular Expressions Cookbook, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.