Chapter 9. Regular Expressions

Regular expressions let you specify pattern strings and use them to perform searches and substitutions. They are not easy to master, but they can be a powerful tool for processing text. Python offers rich regular expression functionality through the re module of the standard library.

Regular Expressions and the re Module

A regular expression (RE) is built from a string that represents a pattern. With RE functionality, you can check any string against the pattern and see which parts of the string, if any, match it.
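As a minimal sketch of that idea, the following snippet checks a sample string against a pattern for ISO-style dates. The sample text and the pattern are made up for illustration; only re.search and re.findall come from the re module.

```python
import re

# Hypothetical sample text and pattern, chosen only for illustration.
text = "Order 66 shipped on 2024-05-17; order 67 ships on 2024-06-01."
pattern = r"\d{4}-\d{2}-\d{2}"  # four digits, dash, two digits, dash, two digits

# search returns a match object for the first match, or None if there is none.
first = re.search(pattern, text)
first_text = first.group()  # the matched substring
first_span = first.span()   # (start, end) indices of the match in text

# findall returns every non-overlapping match as a list of strings.
all_dates = re.findall(pattern, text)

# A string with no matching part yields None rather than raising.
no_match = re.search(pattern, "no dates here")

print(first_text, first_span, all_dates, no_match)
```

Because search returns None when nothing matches, code that calls methods on the result usually tests it first.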

The re module supplies Python’s RE functionality. The compile function builds an RE object from a pattern string and optional flags. The methods of an RE object look for matches of the RE in a string or perform substitutions. The re module also exposes functions equivalent to an RE object’s methods, but with the RE’s pattern string as the first argument.
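The two calling styles described above can be sketched side by side. The pattern and the sample text are illustrative assumptions; compile, findall, sub, and the IGNORECASE flag are part of the re module.

```python
import re

# Hypothetical pattern and text, chosen only for illustration.
pattern = r"\bcolou?r\b"  # "color" or "colour" as a whole word
text = "Color or colour? Both colors match the COLOUR chart."

# Style 1: build an RE object once, then call its methods.
word_re = re.compile(pattern, re.IGNORECASE)
found_via_method = word_re.findall(text)
replaced_via_method = word_re.sub("hue", text)

# Style 2: call the equivalent module-level functions,
# passing the pattern string as the first argument.
found_via_function = re.findall(pattern, text, flags=re.IGNORECASE)
replaced_via_function = re.sub(pattern, "hue", text, flags=re.IGNORECASE)

print(found_via_method)
print(replaced_via_method)
```

Both styles give the same results here. Compiling once tends to read better when the same pattern is reused in several places.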

REs can be difficult to master, and this book does not purport to teach them; we cover only the ways in which you can use REs in Python. For general coverage of REs, we recommend the book Mastering Regular Expressions, by Jeffrey Friedl (O’Reilly), which offers thorough coverage of REs at both tutorial and advanced levels. Many tutorials and references on REs can also be found online, including an excellent, detailed tutorial in Python’s online docs. Sites like Pythex and regex101 let you test your REs interactively.

re and bytes Versus Unicode Strings

In v3, by default, REs work in two ...
