2.17. Match One of Two Alternatives Based on a Condition
Problem
Create a regular expression that matches a comma-delimited list of the words one, two, and three. Each word can occur any number of times in the list, but each word must appear at least once.
Solution
\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}(?(1)|(?!))(?(2)|(?!))(?(3)|(?!))
Regex options: None
Regex flavors: .NET, PCRE, Perl, Python
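As a minimal sketch of how this solution might be used, the following Python snippet compiles the regex verbatim and tests whole strings against it. The names `LIST_WITH_ALL_WORDS` and `contains_all_three` are made up for this illustration, and using `fullmatch` to validate the entire string is an assumption about the intended use rather than part of the recipe.

```python
import re

# The conditional solution from this recipe, copied verbatim.
# (?(1)|(?!)) reads: if group 1 took part in the match, match nothing
# (which succeeds); otherwise try the empty negative lookahead (?!),
# which always fails.
LIST_WITH_ALL_WORDS = re.compile(
    r"\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}"
    r"(?(1)|(?!))(?(2)|(?!))(?(3)|(?!))"
)


def contains_all_three(text):
    """Return True if the whole string is a list that uses one, two, and three."""
    return LIST_WITH_ALL_WORDS.fullmatch(text) is not None
```

With this helper, a list such as one,two,three is accepted, while one,one,one is rejected because the second and third capturing groups never take part in the match.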
Java, JavaScript, and Ruby do not support conditionals. When programming in one of these languages (or any other language), you can use the regular expression without the conditionals, and write some extra code to check whether each of the three capturing groups matched something.
\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
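The "extra code" mentioned above could look roughly like the following sketch. It is written in Python only to keep the examples in one language. The names `LIST_OF_WORDS` and `contains_all_three_checked` are hypothetical. The approach relies on the engine remembering what a group captured in an earlier iteration of the repeated group, which Python's `re` module does; check this behavior for your own flavor before reusing the idea.

```python
import re

# The conditional-free regex from this recipe, copied verbatim.
LIST_OF_WORDS = re.compile(r"\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}")


def contains_all_three_checked(text):
    """Match the list first, then verify in code that every group took part."""
    match = LIST_OF_WORDS.fullmatch(text)
    if match is None:
        return False
    # Python reports a group that never participated in the match as None.
    return all(group is not None for group in match.groups())
```

The regex alone accepts one,one,one; the check on `match.groups()` is what rejects it.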
Discussion
.NET, PCRE, Perl, and Python support conditionals using numbered capturing groups.
‹(?(1)then|else)› is a conditional that checks whether the first capturing group has already matched something. If it has, the regex engine attempts to match ‹then›. If the capturing group has not participated in the match attempt thus far, the ‹else› part is attempted.
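To make the then/else behavior concrete, here is a small Python sketch. The pattern ‹(a)?b(?(1)c|d)› is a toy example invented for this illustration, not part of the recipe, and `toy_matches` is a hypothetical helper name.

```python
import re

# Toy pattern (an assumption for illustration): group 1 is an optional "a".
# If group 1 took part in the match, the "then" part (c) must follow the b;
# if it did not, the "else" part (d) must follow instead.
TOY_CONDITIONAL = re.compile(r"(a)?b(?(1)c|d)")


def toy_matches(text):
    """Return True if the whole string matches the toy conditional pattern."""
    return TOY_CONDITIONAL.fullmatch(text) is not None
```

Under this pattern, abc and bd match, while abd and bc do not, because the branch that is attempted depends on whether the optional a was captured.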
The parentheses, question mark, and vertical bar are all part of the syntax for the conditional. They don’t have their usual meaning. You can use any kind of regular expression for the ‹then› and ‹else› parts. The only restriction is that if you want to use alternation for one of the parts, you have to use a group to keep it together. Only one ...
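The grouping restriction can be sketched in Python as follows. The pattern and the helper names `grouped_matches` and `compiles` are assumptions made for this illustration. In Python's `re` module, leaving the alternation ungrouped gives the conditional a third branch, which the module rejects when the pattern is compiled; other flavors may react differently.

```python
import re

# Alternation inside the "then" part is wrapped in a noncapturing group:
# with a leading "a", either c or d must follow the b;
# without it, the "else" part (e) must follow.
GROUPED_THEN = re.compile(r"(a)?b(?(1)(?:c|d)|e)")


def grouped_matches(text):
    """Return True if the whole string matches the grouped conditional."""
    return GROUPED_THEN.fullmatch(text) is not None


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False
```

Here `compiles(r"(a)?b(?(1)c|d|e)")` returns False, since the ungrouped version contains two vertical bars at the top level of the conditional.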