The re Pattern-Matching Module

The re module is the standard regular expression-matching interface. Regular expression (RE) patterns are specified as strings. This module must be imported.

Module Functions

compile(pattern [, flags])

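Compiles an RE pattern string into a regular expression object for later matching. flags (combinable with the bitwise | operator) include the following, available as attributes at the top level of the re module. Each also has an embedded form, shown in parentheses, that can be written inside the pattern string itself: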

A or ASCII or (?a)

Makes \w, \W, \b, \B, \d, \D, \s, and \S perform ASCII-only matching instead of full Unicode matching. This is meaningful only for Unicode (str) patterns and is ignored for bytes patterns. Note that for backward compatibility, the re.U flag still exists (as well as its synonym re.UNICODE and its embedded counterpart, (?u)), but these are redundant in Python 3, since matches are Unicode by default for strings (and Unicode matching isn’t allowed for bytes).

I or IGNORECASE or (?i)

Case-insensitive matching.

L or LOCALE or (?L)

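Makes \w, \W, \b, \B, and case-insensitive matching dependent on the current locale (the default is Unicode for str patterns in Python 3). \d and \D are not affected by the locale, and recent Python 3 releases accept this flag only for bytes patterns.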

M or MULTILINE or (?m)

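Makes ^ and $ match at the start and end of each line (just after and just before each newline), not only at the start and end of the whole string.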

S or DOTALL or (?s)

. matches all characters, including newline.

U or UNICODE or (?u)

Makes \w, \W, \b, \B, \s, \S, \d, and \D dependent on Unicode character properties (new in version 2.0, and superfluous in Python 3).

X or VERBOSE or (?x)

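Ignores whitespace in the pattern outside character sets (unless escaped with a backslash) and treats # through the end of the line as a comment, so that long patterns can be laid out readably.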
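
The following is a minimal sketch of how some of the flags above might be passed to compile, separately and combined with |. The variable names and sample strings are made up for illustration, and the fullmatch call assumes Python 3.4 or later.

```python
import re

# Flags combine with |. MULTILINE lets ^ match at the start of each line,
# and IGNORECASE lets "error" match any capitalization.
log_pat = re.compile(r"^error: (\w+)", re.IGNORECASE | re.MULTILINE)
log_text = "ok\nError: disk\nERROR: net"   # made-up sample input
causes = log_pat.findall(log_text)

# DOTALL: without it, . does not match a newline.
plain_dot = re.compile(r"a.b").search("a\nb")
dotall_dot = re.compile(r"a.b", re.DOTALL).search("a\nb")

# ASCII: \w stops matching the non-ASCII letter at the end of the word.
unicode_words = re.compile(r"\w+").findall("caf\u00e9")
ascii_words = re.compile(r"\w+", re.ASCII).findall("caf\u00e9")

# VERBOSE: whitespace and # comments inside the pattern are ignored.
number_pat = re.compile(r"""
    \d+             # integer part
    (?: \. \d+ )?   # optional fractional part
""", re.VERBOSE)
is_number = number_pat.fullmatch("3.14") is not None

# Embedded form: (?i) at the start of the pattern acts like re.IGNORECASE.
embedded = re.compile(r"(?i)hello").match("HELLO")

print(causes, ascii_words, is_number)
```

Compiling once and reusing the resulting object is mainly worthwhile when the same pattern is applied many times. The embedded forms are convenient when a pattern string is passed to code that offers no separate flags argument.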

match(pattern, string [, flags])

If zero or more characters at start of string match the ...
