Chapter 8. Lookarounds

Lookarounds are non-capturing groups that succeed or fail based on what they find either ahead of or behind the current position in the text. Lookarounds are also called zero-width assertions, because they check for a pattern without consuming any characters.
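To make "zero-width" concrete, here is a minimal sketch in Python. Python is used only for illustration (it is not one of the tools this chapter works with), and the two-word string is a stand-in rather than the poem's file. The lookahead stands alone, so the match succeeds at the point where Marinere begins but contains no characters.

```python
import re

text = "ancyent Marinere"

# A lookahead on its own: it succeeds where "Marinere" begins,
# but the match itself contains no characters.
m = re.search(r"(?=Marinere)", text)

print(repr(m.group()))     # the matched text is the empty string
print(m.start(), m.end())  # start and end are the same position
```

The match has a position but no length, which is what separates an assertion from an ordinary group.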

Lookarounds include:

  • Positive lookaheads

  • Negative lookaheads

  • Positive lookbehinds

  • Negative lookbehinds
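As a rough preview, the four kinds can be put side by side in a short Python sketch. Python is an assumption made for illustration, and the sample sentence is invented rather than taken from the poem's file. Each pattern pairs the word ancyent or Marinere with one kind of lookaround.

```python
import re

# Invented sample sentence (an assumption, not a line from the poem's file).
text = "The ancyent Marinere met an ancyent man and a bright-eyed Marinere."

patterns = {
    "positive lookahead": r"ancyent (?=Marinere)",    # ancyent followed by Marinere
    "negative lookahead": r"ancyent (?!Marinere)",    # ancyent not followed by Marinere
    "positive lookbehind": r"(?<=ancyent )Marinere",  # Marinere preceded by ancyent
    "negative lookbehind": r"(?<!ancyent )Marinere",  # Marinere not preceded by ancyent
}

# Record where each pattern matches in the sample sentence.
starts = {
    name: [m.start() for m in re.finditer(pattern, text)]
    for name, pattern in patterns.items()
}

for name, positions in starts.items():
    print(f"{name}: match starts at {positions}")
```

In this sentence each pattern matches exactly once, and each at a different place: the lookaheads pick out one ancyent or the other, and the lookbehinds pick out one Marinere or the other.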

In this chapter, I’ll show you how each of these works. We’ll start out using RegExr on the desktop and then move on to Perl and ack (grep’s basic and extended syntaxes don’t know about lookarounds). Our text is still Coleridge’s well-worn poem.

Positive Lookaheads

Suppose you want to find every occurrence of the word ancyent that is followed by marinere (I use the archaic spellings because that is what is found in the file). To do this, we could use a positive lookahead.

First let’s try it in RegExr desktop. The following case-insensitive pattern goes in the text box at the top:

(?i)ancyent (?=marinere)

Note

You can also specify case-insensitivity in RegExr by simply checking the box next to ignoreCase; either method works.

Because the pattern uses the case-insensitive option (?i), you don’t need to worry about what case you use in it. You are looking for every place where the word ancyent is followed hard by marinere. The results will be highlighted in the text area below the pattern area (see Figure 8-1); however, only the first part of the pattern will be highlighted (ancyent and the space after it), not the text that the lookahead checks for (marinere), because a lookahead does not consume what it finds.

Figure 8-1. Positive lookahead in RegExr
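If you don’t have RegExr at hand, the same behavior can be sketched in a few lines of Python. This is an illustration under assumptions: Python is not one of the tools used in this chapter, and the sample string is a short stand-in for the poem’s file, not its exact contents. The pattern is the one shown above.

```python
import re

# A short stand-in for the poem's text (an assumption, not the actual file).
sample = (
    "THE RIME OF THE ANCYENT MARINERE\n"
    "It is an ancyent Marinere,\n"
    "And he stoppeth one of three:\n"
)

# Same pattern as above: (?i) makes the whole pattern case-insensitive,
# and (?=marinere) requires marinere to follow without matching it.
pattern = re.compile(r"(?i)ancyent (?=marinere)")

matches = [m.group() for m in pattern.finditer(sample)]

for text in matches:
    print(repr(text))
```

Each match holds only ancyent and the trailing space, in whatever case it appeared; the word that follows is checked but never becomes part of the match, just as it is left unhighlighted in RegExr.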
