2.21. Insert Part of the Regex Match into the Replacement Text

Problem

Match any contiguous sequence of 10 digits, such as 1234567890. Convert the sequence into a nicely formatted phone number—for example, (123) 456-7890.

Solution

Regular expression

\b(\d{3})(\d{3})(\d{4})\b
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Replacement

($1) $2-$3
Replacement text flavors: .NET, Java, JavaScript, PHP, Perl
(${1}) ${2}-${3}
Replacement text flavors: .NET, PHP, Perl
(\1) \2-\3
Replacement text flavors: PHP, Python, Ruby
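
As a minimal sketch of how this solution might be applied in Python, the snippet below uses the standard `re` module. The names `PHONE_DIGITS` and `format_phone_numbers` are hypothetical, chosen only for illustration. The replacement text includes the space after the closing parenthesis that the target format (123) 456-7890 calls for.

```python
import re

# The regex from this recipe: exactly 10 digits between word boundaries.
# PHONE_DIGITS and format_phone_numbers are illustrative names only.
PHONE_DIGITS = re.compile(r"\b(\d{3})(\d{3})(\d{4})\b")


def format_phone_numbers(text: str) -> str:
    """Rewrite every standalone 10-digit run in text as (123) 456-7890."""
    # \1, \2, and \3 insert the text matched by each capturing group.
    return PHONE_DIGITS.sub(r"(\1) \2-\3", text)


print(format_phone_numbers("Call 1234567890 today."))
# prints: Call (123) 456-7890 today.
```

Because of the word boundaries, a run of 11 or more digits is left unchanged rather than partially reformatted.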

Discussion

Replacements using capturing groups

Recipe 2.10 explains how you can use capturing groups in your regular expression to match the same text more than once. The text matched by each capturing group in your regex is also available after each successful match. You can insert the text of some or all capturing groups—in any order, or even more than once—into the replacement text.
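
The point about order and repetition can be illustrated with a short Python sketch. The output layout here is arbitrary and serves only to show that a group may be reused or left out; `pattern` and `rearranged` are hypothetical names.

```python
import re

pattern = re.compile(r"\b(\d{3})(\d{3})(\d{4})\b")

# Group 3 comes first, group 1 is inserted twice, and group 2 is omitted.
rearranged = pattern.sub(r"\3/\1/\1", "1234567890")
print(rearranged)
# prints: 7890/123/123
```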

Some flavors, such as Python and Ruby, use the same «\1» syntax for backreferences in both the regular expression and the replacement text. Other flavors use Perl’s «$1» syntax, using a dollar sign instead of a backslash. PHP supports both.
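
A small Python sketch of this difference follows. The same `\1` token acts as a backreference inside the regex and as an insertion in the replacement text, while a dollar-sign reference is not recognized by Python and is inserted literally. The doubled-word pattern and the variable names are assumptions chosen for illustration, not part of this recipe.

```python
import re

text = "the the cat"

# In the regex, \1 matches the same text group 1 just matched (a repeated word).
# In the replacement, \1 inserts the text captured by group 1.
collapsed = re.sub(r"\b(\w+) \1\b", r"\1", text)
print(collapsed)
# prints: the cat

# Python's re module does not treat $1 as a group reference,
# so the characters "$1" end up in the result unchanged.
literal_dollar = re.sub(r"\b(\w+) \1\b", "$1", text)
print(literal_dollar)
# prints: $1 cat
```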

In Perl, «$1» and above are actually variables that are set after each successful regex match. You can use them anywhere in your code until the next regex match. .NET, Java, JavaScript, and PHP support «$1» only in the replacement syntax. These programming languages do offer other ways to access capturing groups in code. Chapter 3
