You want to match any word that is not immediately followed by a specific word,
ignoring any whitespace, punctuation, or other nonword characters that
appear in between.
Negative lookahead is the secret ingredient for this regular expression:
Regex options: Case insensitive
As with many other recipes in this chapter, word boundaries
(‹\b›) and the word character token (‹\w›)
work together to match a complete word. You can find in-depth
descriptions of these features in Recipe 2.6.
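
As a minimal sketch of how those two tokens cooperate, the Python snippet below uses `\b\w+\b` on its own to pull complete words out of a string. The sample sentence is made up for illustration and is not taken from the recipe.

```python
import re

# \b asserts a word boundary; \w+ matches one or more word characters.
# Together they match each complete word and skip the punctuation between.
word_pattern = re.compile(r"\b\w+\b")

# Hypothetical sample text, chosen only for this illustration.
words = word_pattern.findall("Stop, look; listen!")
print(words)  # ['Stop', 'look', 'listen']
```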
The ‹(?!⋯)› surrounding the
second part of this regex is a negative lookahead. Lookahead tells the regex engine to temporarily step
forward in the string, to check whether the pattern inside the
lookahead can be matched just ahead of the current position. It does
not consume any of the characters matched inside the lookahead.
Instead, it merely asserts whether a match is possible. Since we’re
using a negative lookahead, the result of the assertion is inverted.
In other words, if the pattern inside the lookahead can be matched
just ahead, the match attempt fails, and the regex engine moves forward to
try all over again starting from the next character in the subject
string. You can find much more detail about lookahead (and its
counterpart, lookbehind) in Recipe 2.16 ...
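
To make the behavior described above concrete, here is a hedged Python sketch. It assumes the pattern has the shape `\b\w+\b(?!\W+WORD\b)`, which is consistent with the explanation but is an assumption rather than a quotation of the recipe's regex. The helper name `words_not_followed_by`, the blocked word "fox", and the sample sentence are all hypothetical.

```python
import re

def words_not_followed_by(text, blocked_word):
    """Return every word in text that is not followed by blocked_word.

    Hypothetical helper. The pattern shape is an assumption:
      \\b\\w+\\b      matches one complete word
      (?!\\W+X\\b)    negative lookahead: fails if nonword characters
                      and then the whole word X come next
    """
    pattern = re.compile(
        r"\b\w+\b(?!\W+" + re.escape(blocked_word) + r"\b)",
        re.IGNORECASE,  # mirrors the "Case insensitive" regex option
    )
    return pattern.findall(text)

# Hypothetical sample: "quick" and "the" are each followed by "fox"
# (with punctuation or whitespace in between), so they are excluded.
sample = "The quick -- Fox jumped over the fox."
print(words_not_followed_by(sample, "fox"))
# ['The', 'Fox', 'jumped', 'over', 'fox']
```

Because the lookahead consumes nothing, each match contains only the word itself. The intervening punctuation and the blocked word remain available to later match attempts, which is why "Fox" and "fox" still appear in the result.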