5.7. Find Words Near Each Other
Problem
You want to emulate a NEAR search using a regular expression. For readers
unfamiliar with the term, some search tools that use Boolean operators
such as NOT and OR also have a special operator called NEAR. Searching
for “word1 NEAR word2” finds word1 and word2 in any order, as long as
they occur within a certain distance of each other.
Solution
If you’re only searching for two different words, you can combine two
regular expressions: one that matches word1 before word2, and another
that flips the order of the words. The following regex allows up to five
words to separate the two you’re searching for:
\b(?:word1\W+(?:\w+\W+){0,5}?word2|word2\W+(?:\w+\W+){0,5}?word1)\b
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
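As a rough illustration, the regex above could be tried out in Python along the following lines. This is only a sketch: the name `NEAR` and the sample sentences are made up for the example, and `word1` and `word2` are the literal placeholder words from the regex, which you would replace with your own search terms.

```python
import re

# The recipe's regex, compiled with the case-insensitive option.
NEAR = re.compile(
    r"\b(?:word1\W+(?:\w+\W+){0,5}?word2|word2\W+(?:\w+\W+){0,5}?word1)\b",
    re.IGNORECASE,
)

# Hypothetical sample texts, for illustration only.
close = "Word1 and then word2."        # two words in between
reverse = "WORD2, then word1"          # terms in the opposite order
far = "word1 a b c d e f g word2"      # seven words in between

for text in (close, reverse, far):
    match = NEAR.search(text)
    # Show the matched span of text, or None when the terms are too far apart.
    print(repr(text), "->", match.group(0) if match else None)
```

The third sample has more than five words between the two terms, so neither alternative of the regex can match it.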
\b(?:
  word1                 # first term
  \W+ (?:\w+\W+){0,5}?  # up to five words
  word2                 # second term
|                       # or, the same pattern in reverse...
  word2                 # second term
  \W+ (?:\w+\W+){0,5}?  # up to five words
  word1                 # first term
)\b
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby
The second regular expression here uses the free-spacing option and adds whitespace and comments for readability. Apart from that, the two regular expressions are identical. JavaScript doesn’t support free-spacing mode, but the other listed regex flavors allow you to take your pick. Recipes 3.5 and 3.7 show examples of how you can add these regular expressions to your search form or other code. ...
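To make the equivalence of the two forms concrete, here is a minimal Python sketch in which `re.VERBOSE` plays the role of the free-spacing option. The names `COMPACT`, `VERBOSE`, `spans`, and the sample strings are assumptions introduced for this illustration; they are not part of the recipe.

```python
import re

# Single-line form of the regex.
COMPACT = re.compile(
    r"\b(?:word1\W+(?:\w+\W+){0,5}?word2|word2\W+(?:\w+\W+){0,5}?word1)\b",
    re.IGNORECASE,
)

# Free-spacing form: with re.VERBOSE, unescaped whitespace in the pattern
# is ignored and "#" starts a comment that runs to the end of the line.
VERBOSE = re.compile(
    r"""
    \b(?:
      word1                 # first term
      \W+ (?:\w+\W+){0,5}?  # up to five words
      word2                 # second term
    |                       # or, the same pattern in reverse...
      word2                 # second term
      \W+ (?:\w+\W+){0,5}?  # up to five words
      word1                 # first term
    )\b
    """,
    re.IGNORECASE | re.VERBOSE,
)


def spans(pattern, text):
    """Return the (start, end) offsets of every match of pattern in text."""
    return [m.span() for m in pattern.finditer(text)]


# Hypothetical sample texts, for illustration only.
samples = [
    "word1 is near word2",
    "word2 one two three four five word1",
    "word1 1 2 3 4 5 6 word2",
]

for text in samples:
    # Both forms should report the same match positions.
    print(repr(text), spans(COMPACT, text), spans(VERBOSE, text))
```

On these samples both compiled patterns report the same match positions, which is what you would expect if the whitespace and comments are the only difference between them.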