Regular Expressions and the re Module

A regular expression is a string that represents a pattern. With regular expression functionality, you can compare that pattern to another string and see if any part of the string matches the pattern.

The re module supplies all of Python’s regular expression functionality. The compile function builds a regular expression object from a pattern string and optional flags. The methods of a regular expression object look for matches of the regular expression in a string and/or perform substitutions. Module re also exposes functions equivalent to a regular expression’s methods, but with the regular expression’s pattern string as their first argument.

Regular expressions can be difficult to master, and this book does not purport to teach them—I cover only the ways in which you can use them in Python. For general coverage of regular expressions, I recommend the book Mastering Regular Expressions, by Jeffrey Friedl (O’Reilly). Friedl’s book offers thorough coverage of regular expressions at both the tutorial and advanced levels.

Pattern-String Syntax

The pattern string representing a regular expression follows a specific syntax:

  • Alphabetic and numeric characters stand for themselves. A regular expression whose pattern is a string of letters and digits matches the same string.

  • Many alphanumeric characters acquire special meaning in a pattern when they are preceded by a backslash (\).

  • Punctuation works the other way around. A punctuation character is self-matching ...

Get Python in a Nutshell now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.