Regular Expressions Cookbook, 2nd Edition
by Jan Goyvaerts, Steven Levithan
August 2012
Intermediate to advanced
609 pages
19h 16m
English
O'Reilly Media, Inc.

2.16. Test for a Match Without Adding It to the Overall Match

Problem

Find any word that occurs between a pair of HTML bold tags, without including the tags in the regex match. For instance, if the subject is My <b>cat</b> is furry, the only valid match should be cat.

Solution

(?<=<b>)\w+(?=</b>)
Regex options: Case insensitive
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby 1.9

JavaScript and Ruby 1.8 support the lookahead (?=</b>), but not the lookbehind (?<=<b>).
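As a rough illustration, the solution can be tried in Python, one of the flavors listed above. This is only a sketch: the name BOLD_WORD and the second test string are made up for the example, and re.IGNORECASE stands in for the "Case insensitive" option.

```python
import re

# The recipe's regex; re.IGNORECASE plays the role of "Case insensitive".
# BOLD_WORD is a hypothetical name chosen for this sketch.
BOLD_WORD = re.compile(r"(?<=<b>)\w+(?=</b>)", re.IGNORECASE)

subject = "My <b>cat</b> is furry"
match = BOLD_WORD.search(subject)

word = match.group()   # the matched text, without the tags
span = match.span()    # start and end offsets of the match in subject

# With the case-insensitive option, uppercase tags are accepted too.
all_words = BOLD_WORD.findall("<B>Dog</B> and <b>cat</b>")

print(word, span, all_words)
```

The span covers only the three letters of cat: the tags are tested by the lookbehind and lookahead but are not part of the overall match.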

Discussion

Lookaround

The four kinds of lookaround groups supported by modern regex flavors have the special property of giving up the text matched by the part of the regex inside the lookaround. Essentially, lookaround checks whether certain text can be matched without actually matching it.

Lookaround that looks backward is called lookbehind. This is the only regular expression construct that will traverse the text from right to left instead of from left to right. The syntax for positive lookbehind is (?<=...). The four characters (?<= form the opening bracket. What you can put inside the lookbehind, here represented by ..., varies among regular expression flavors. But simple literal text, such as (?<=<b>), always works.
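To make the "checks without matching" behavior concrete, here is a small Python sketch that applies the lookbehind on its own to the sample subject. The variable names are chosen for illustration only. Note that Python's re module is one of the flavors that restricts lookbehind contents to fixed-width patterns, so the literal <b> is a safe choice.

```python
import re

subject = "My <b>cat</b> is furry"

# A lookbehind on its own succeeds at a position, not on any text.
m = re.search(r"(?<=<b>)", subject)

position = m.start()        # offset just after the opening tag
matched_text = m.group()    # zero-length: the tag itself is not consumed

# Whatever follows the lookbehind in the regex starts matching at that position.
first_letter = re.search(r"(?<=<b>)\w", subject).group()

print(position, repr(matched_text), first_letter)
```

The match is empty and sits right after <b>, which is why the tag never shows up in the overall match of the full solution.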

Lookbehind checks to see whether the text inside the lookbehind occurs immediately to the left of the position that the regular expression engine has reached. If you match (?<=<b>) against My <b>cat</b> is furry, the lookbehind will fail to match until the regular expression starts the match attempt ...

ISBN: 9781449327453