Quantifiers
The characters and character classes we’ve talked about all match single
characters. We mentioned that you could match multiple “word” characters
with \w+. The + is one kind of quantifier, but there are
others. All of them are placed after the item being quantified.
The most general form of quantifier specifies both the minimum and
maximum number of times an item can match. You put the two numbers in
braces, separated by a comma. For example, if you were trying to match
North American phone numbers, the sequence \d{7,11} would match at least seven digits,
but no more than eleven digits. If you put a single number in the
braces, the number specifies both the minimum and the maximum; that is,
the number specifies the exact number of times the item must match. (All
unquantified items have an implicit {1} quantifier.)
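As a rough illustration of the {MIN,MAX} and {COUNT} forms, here is a minimal sketch using Python's re module, which accepts the same brace syntax for these patterns. The sample strings and variable names are made up for the example.

```python
import re

# \d{7,11}: at least seven digits, at most eleven.
ranged = re.compile(r"\d{7,11}")

# Twelve digits in a row: the quantifier stops after the eleventh.
matched_ranged = ranged.search("call 123456789012 now").group()

# Only six digits: the minimum is not met, so nothing matches.
too_short = ranged.search("call 123456 now")

# \d{7}: a single number means exactly seven repetitions.
exact = re.compile(r"\d{7}")
matched_exact = exact.search("call 5551234 now").group()

# An unquantified item behaves as if it carried {1}.
implicit_one = (re.fullmatch(r"\d", "7") is not None
                and re.fullmatch(r"\d{1}", "7") is not None)

print(matched_ranged)  # 12345678901
print(too_short)       # None
print(matched_exact)   # 5551234
print(implicit_one)    # True
```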
If you put the minimum and the comma but omit the maximum, then
the maximum is taken to be infinity. In other words, it will match at
least the minimum number of times, plus as many as it can get after
that. For example, \d{7} will match
only the first seven digits (a local North American phone number, for
instance, or the first seven digits of a longer number), while \d{7,} will match any phone number, even an
international one (unless it happens to be shorter than seven digits).
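The difference between \d{7} and \d{7,} can be seen in a short sketch, again using Python's re module as a stand-in with the same quantifier syntax. The twelve-digit number below is a made-up sample, not a real phone number.

```python
import re

number = "441632960123"  # hypothetical 12-digit number

# \d{7} takes exactly seven digits and then stops.
first_seven = re.search(r"\d{7}", number).group()

# \d{7,} takes seven digits plus as many more as it can get.
whole_number = re.search(r"\d{7,}", number).group()

# Fewer than seven digits: the minimum is not met, so no match.
too_short = re.search(r"\d{7,}", "55512")

print(first_seven)   # 4416329
print(whole_number)  # 441632960123
print(too_short)     # None
```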
There is no special way of saying “at most” a certain number of times.
Just say .{0,5}, for example, to find
at most five arbitrary characters.
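A minimal sketch of the "at most" idiom, written with Python's re module under the assumption that it treats .{0,5} the same way for these inputs. The sample strings are invented for illustration.

```python
import re

# .{0,5}: zero to five arbitrary characters (newline excluded by default).
at_most_five = re.compile(r".{0,5}")

# A long string yields only the first five characters.
from_long = at_most_five.match("abcdefgh").group()

# A short string yields everything that is there.
from_short = at_most_five.match("abc").group()

# Because the minimum is zero, even an empty string matches.
from_empty = at_most_five.match("").group()

print(repr(from_long))   # 'abcde'
print(repr(from_short))  # 'abc'
print(repr(from_empty))  # ''
```

Note that a minimum of zero means the pattern always succeeds, which is worth keeping in mind when it appears on its own rather than as part of a larger pattern.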
Certain combinations of minimum and maximum ...