
Python Pocket Reference
Pattern Syntax
Pattern strings are specified by concatenating forms (see
Table 19) as well as by character class escapes (see Table 20).
Python character escapes (e.g., \t for tab) can also appear.
Pattern strings are matched against text strings, yielding a
match object on success (None on failure, so the result works
as a Boolean test), as well as grouped substrings matched
by subpatterns in parentheses:
>>> import re
>>> patt = re.compile('hello[ \t]*(.*)')
>>> mobj = patt.match('hello world!')
>>> mobj.group(1)
'world!'
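Because a failed match returns None rather than a match object, the result is usually tested before any group is fetched. The following is a minimal sketch that reuses the pattern above. The helper name rest_after_hello is hypothetical, and writing the pattern as a raw string is a stylistic assumption, not something the example above requires.

```python
import re

# Raw string: the regex engine, not the Python parser, interprets \t as a tab.
patt = re.compile(r'hello[ \t]*(.*)')

def rest_after_hello(text):
    """Return the text following 'hello', or None if there is no match.

    Hypothetical helper, for illustration only.
    """
    mobj = patt.match(text)   # match object on success, None on failure
    if mobj is None:
        return None
    return mobj.group(1)      # substring matched by the first parenthesized group

found = rest_after_hello('hello \tworld!')     # spaces and tabs are skipped
missing = rest_after_hello('goodbye world!')   # match() anchors at the start
```

Here found should be 'world!' and missing should be None, since match only succeeds at the beginning of the text string.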
In Table 19, C is any character, R is any regular expression
form in the left column of the table, and m and n are integers.
Each form usually consumes as much of the string being matched
as possible, except for the nongreedy forms (which consume as
little as possible, as long as the entire pattern still matches
the target string).
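The difference between greedy and nongreedy matching can be sketched as follows. This is an illustrative example only: the sample text is made up, and the nongreedy form .*? is standard re module syntax that is assumed here and does not appear in the part of Table 19 shown.

```python
import re

text = '<b>bold</b> and <i>italic</i>'   # made-up sample text

# Greedy: .* consumes as much as possible, so the group runs up to the last '>'.
greedy = re.match('<(.*)>', text).group(1)

# Nongreedy: .*? consumes as little as possible, stopping before the first '>'.
nongreedy = re.match('<(.*?)>', text).group(1)

# Nongreedy still expands as far as needed for the entire pattern to match:
# the trailing $ forces the closing '>' to be the last character of the string.
anchored = re.match('<(.*?)>$', text).group(1)
```

With this text, nongreedy should be 'b', while greedy and anchored should both be 'b>bold</b> and <i>italic</i'.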
Table 19. Regular expression pattern syntax
Form      Description
.         Matches any character (including newline if DOTALL flag is specified).
^         Matches start of string (of every line in MULTILINE mode).
$         Matches end of string (of every line in MULTILINE mode).
C         Any nonspecial character matches itself.
R*        Zero or more occurrences of preceding regular expression R (as many as possible).
R+        One or more occurrences of preceding regular expression R (as many as possible).
R?        Zero or one occurrence of preceding regular expression R.
R{m,n}    Matches from m to n repetitions of preceding regular expression ...