Regular Expressions and the re Module
A regular expression is a string that represents a pattern. With regular expression functionality, you can compare that pattern to another string and see if any part of the string matches the pattern.
The re module supplies all of Python’s regular expression functionality. The compile function builds a regular expression object from a pattern string and optional flags. The methods of a regular expression object look for matches of the regular expression in a string and/or perform substitutions. Module re also exposes functions equivalent to a regular expression’s methods, but with the regular expression’s pattern string as their first argument.
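As a minimal sketch of the two styles just described, the following compares the methods of a compiled regular expression object with the equivalent module-level functions. The pattern, the sample strings, and the variable names are illustrative assumptions, not taken from the text.

```python
import re

# Build a regular expression object from a pattern string and an optional flag.
pattern = re.compile(r"hello", re.IGNORECASE)

# Methods of the compiled object look for matches or perform substitutions.
match = pattern.search("Say Hello to regular expressions")
found = match.group(0) if match else None
replaced = pattern.sub("goodbye", "Hello, world")

# The equivalent module-level functions take the pattern string first.
match2 = re.search(r"hello", "Say Hello to regular expressions", re.IGNORECASE)
found2 = match2.group(0) if match2 else None
replaced2 = re.sub(r"hello", "goodbye", "Hello, world", flags=re.IGNORECASE)

print(found, replaced)
print(found2, replaced2)
```

Compiling once is usually the more convenient choice when the same pattern is applied many times, while the module-level functions suit one-off uses.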
Regular expressions can be difficult to master, and this book does not purport to teach them—I cover only the ways in which you can use them in Python. For general coverage of regular expressions, I recommend the book Mastering Regular Expressions, by Jeffrey Friedl (O’Reilly). Friedl’s book offers thorough coverage of regular expressions at both the tutorial and advanced levels.
Pattern-String Syntax
The pattern string representing a regular expression follows a specific syntax:
Alphabetic and numeric characters stand for themselves. A regular expression whose pattern is a string of letters and digits matches the same string.
Many alphanumeric characters acquire special meaning in a pattern when they are preceded by a backslash (\).
Punctuation works the other way around. A punctuation character is self-matching ...