Specifying a Regular Expression for the Shortest Match

by David Beazley
April 2024
O'Reilly Media, Inc.

Problem

You’re trying to match a text pattern using regular expressions, but the regular expression is identifying the longest possible matches of a pattern. Instead, you would like to change it to find the shortest possible match.

Solution

This problem often arises in patterns that try to match text enclosed inside a pair of starting and ending delimiters (e.g., a quoted string). To illustrate, consider this example:

>>> import re
>>> str_pat = re.compile(r'\"(.*)\"')
>>> text1 = 'Computer says "no."'
>>> str_pat.findall(text1)
['no.']
>>> text2 = 'Computer says "no." Phone says "yes."'
>>> str_pat.findall(text2)
['no." Phone says "yes.']
>>>

In this example, the pattern r'\"(.*)\"' is attempting to match text enclosed inside double quotes. However, the * operator in a regular expression is greedy, so matching is based on finding the longest possible match. Thus, in the second example involving text2, it incorrectly returns a single match that runs from the first opening quote to the last closing quote, spanning both quoted strings.

To fix this, add the ? modifier after the * operator in the pattern, like this:

>>> str_pat = re.compile(r'\"(.*?)\"')
>>> str_pat.findall(text2)
['no.', 'yes.']
>>>

This makes the matching nongreedy and instead produces the shortest match.
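To see where each match starts and ends, the two patterns can be compared side by side. The following is a minimal sketch rather than part of the original recipe; the helper name quoted_spans and the variable names greedy and nongreedy are illustrative assumptions.

```python
import re

text = 'Computer says "no." Phone says "yes."'

# Same pattern as in the recipe, with and without the ? modifier.
# (Names below are hypothetical, chosen only for this illustration.)
greedy = re.compile(r'"(.*)"')
nongreedy = re.compile(r'"(.*?)"')

def quoted_spans(pattern, s):
    """Return (start, end, captured_text) for each match of pattern in s."""
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(s)]

greedy_spans = quoted_spans(greedy, text)
nongreedy_spans = quoted_spans(nongreedy, text)

# The greedy pattern yields one long match; the nongreedy one yields
# a separate match for each quoted string.
print(greedy_spans)
print(nongreedy_spans)
```

The greedy pattern reports a single span that covers both quoted strings, while the nongreedy pattern stops at the first closing quote it can and reports one span per quoted string.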

Discussion

This recipe addresses one of the more common problems encountered ...

ISBN: 9781098171384