Matching from Where the Last Pattern Left Off
Problem
You want to match again from where the last pattern left off.
This is a useful approach to take when repeatedly extracting data in chunks from a string.
Solution
Use a combination of the /g match modifier, the \G pattern anchor, and the pos function.
Discussion
If you use the /g modifier on a match, the regular expression engine keeps track of its position in the string when it finished matching. The next time you match with /g, the engine starts looking for a match from this remembered position. This lets you use a while loop to extract the information you want from the string.
while (/(\d+)/g) { print "Found $1\n"; }
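The remembered position can be inspected with pos. The following is a minimal sketch, assuming a made-up sample string ($str is hypothetical, not from the recipe), that records where the engine stands after each successful match:

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $str = "10 apples, 25 pears, 7 plums";

my @found;
while ($str =~ /(\d+)/g) {
    # pos($str) is the offset just past the end of the most recent match.
    push @found, "$1 ends at offset " . pos($str);
}
print "$_\n" for @found;

# The final, failed match resets the position, so pos() is now undefined.
print "pos is reset\n" unless defined pos($str);
```

Because the position belongs to the string rather than to the pattern, a later /g match against the same variable picks up from wherever pos points.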
You can also use \G in your pattern to anchor it to the end of the previous match. For example, if you had a number stored in a string with leading blanks, you could change each leading blank into the digit zero this way:
$n = "   49 here";
$n =~ s/\G /0/g;
print $n;
00049 here
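To see what the anchor contributes, it may help to run the same substitution with and without \G. This sketch assumes a string with three leading blanks, built with the x operator so the spaces are unambiguous; the variable names are illustrative only:

```perl
use strict;
use warnings;

# Three leading blanks followed by the text (an assumed input).
my $anchored   = (" " x 3) . "49 here";
my $unanchored = $anchored;

# \G forces each replacement to start exactly where the previous one ended,
# so the substitution stops at the first character that is not a blank.
$anchored =~ s/\G /0/g;

# Without \G, every blank in the string is replaced, including the inner one.
$unanchored =~ s/ /0/g;

print "$anchored\n";
print "$unanchored\n";
```

Only the anchored version leaves the blank between the number and the following word untouched.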
You can also make good use of \G in a while loop. Here we use \G to parse a comma-separated list of numbers (e.g., "3,4,5,9,120"):
while (/\G,?(\d+)/g) { print "Found number $1\n"; }
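The effect of \G in such a loop shows up most clearly when the list is followed by other text. The sketch below assumes a hypothetical input in which a stray number appears after the list; the anchored loop stops at the end of the list, while the unanchored one skips ahead and picks up the stray number too:

```perl
use strict;
use warnings;

# Hypothetical input: the trailing "77" is not part of the comma-separated list.
my $list = "3,4,5,9,120 and 77";

# Anchored: each match must begin where the previous one ended.
my @anchored;
while ($list =~ /\G,?(\d+)/g) {
    push @anchored, $1;
}

# Unanchored: the engine is free to scan forward past " and ".
# (The failed match above reset pos, so this loop starts at the beginning.)
my @unanchored;
while ($list =~ /,?(\d+)/g) {
    push @unanchored, $1;
}

print "anchored:   @anchored\n";
print "unanchored: @unanchored\n";
```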
By default, when your match fails (when we run out of numbers in the examples, for instance), the remembered position is reset to the start. If you don’t want this to happen, perhaps because you want to continue matching from that position but with a different pattern, use the /c modifier with /g:
$_ = "The year 1752 lost 10 days on the 3rd of September";
while (/(\d+)/gc) ...
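As an illustration of continuing with a different pattern, here is a minimal sketch on a separate, made-up string (it is not the continuation of the truncated example above). It assumes we want every number first, then the word that follows the last number:

```perl
use strict;
use warnings;

# Hypothetical input, chosen only for illustration.
my $text = "order 66 shipped 3 boxes today";

# /c keeps the remembered position when the loop's final match fails.
my @numbers;
while ($text =~ /(\d+)/gc) {
    push @numbers, $1;
}

# pos() still points just past the last number found.
my $resume_at = pos($text);

# A different pattern, anchored with \G, resumes from that position.
my $next_word;
if ($text =~ /\G\s*(\w+)/g) {
    $next_word = $1;
}

print "numbers: @numbers\n";
print "resumed at offset $resume_at, next word: $next_word\n";
```

Without /c on the loop, the failed match would reset the position, and the \G pattern would then try to match at the start of the string instead.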