5.14. Escape Regular Expression Metacharacters

Problem

You want to use a literal string provided by a user or from some other source as part of a regular expression. However, you want to escape all regular expression metacharacters within the string before embedding it in your regex, to avoid any unintended consequences.

Solution

By adding a backslash before any characters that potentially have special meaning within a regular expression, you can safely use the resulting pattern to match a literal sequence of characters. Of the programming languages covered by this book, all except JavaScript have a built-in function or method to perform this task (listed in Table 5-3). However, for the sake of completeness, we’ll show how to pull this off using your own regex, even in the languages that have a ready-made solution.

Built-in solutions

Table 5-3 lists the built-in functions and methods designed to solve this problem.

Table 5-3. Built-in solutions for escaping regular expression metacharacters

Language      Function
C#, VB.NET    Regex.Escape(str)
Java          Pattern.quote(str)
XRegExp       XRegExp.escape(str)
Perl          quotemeta(str)
PHP           preg_quote(str, [delimiter])
Python        re.escape(str)
Ruby          Regexp.escape(str)

Notably absent from the list is JavaScript (without XRegExp), which does not have a native function designed for this purpose.
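To see why the escaping step matters, here is a minimal Python sketch built on re.escape from Table 5-3. The helper name find_literal and the sample strings are hypothetical, chosen only for illustration.

```python
import re

def find_literal(user_text, haystack, escape=True):
    """Search haystack for user_text, escaping metacharacters unless told not to.

    find_literal is a hypothetical helper for this illustration only.
    """
    source = re.escape(user_text) if escape else user_text
    return re.search(source, haystack)

user_text = "1+1=2"  # sample input; "+" is a quantifier when left unescaped

# Without escaping, the pattern means "one or more 1s, then 1=2".
unescaped_hits_literal = find_literal(user_text, "1+1=2", escape=False) is not None
unescaped_hits_other = find_literal(user_text, "11=2", escape=False) is not None

# With escaping, only the literal character sequence matches.
escaped_hits_literal = find_literal(user_text, "1+1=2") is not None
escaped_hits_other = find_literal(user_text, "11=2") is not None
```

In this sketch the unescaped pattern fails to find the very text the user typed and instead matches a different string, while the escaped pattern matches only the literal sequence.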

Regular expression

Although it’s best to use a built-in solution if available, you can pull this off on your own by using the following regular expression along with the appropriate replacement ...
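The regex and replacement text are cut off in the passage above, so the following Python sketch only illustrates the general approach and does not reproduce them. The character class, the name escape_metachars, and the decision to also escape whitespace and # (which matter only in free-spacing mode) are assumptions made here.

```python
import re

# Assumed set of characters to escape; not taken from the text above.
# It covers the usual metacharacters plus "-", "#", and whitespace.
_METACHARS = re.compile(r"[\\^$.|?*+()\[\]{}\-#\s]")

def escape_metachars(text):
    """Put a backslash in front of every character matched by _METACHARS."""
    # In the replacement template, \\ is a literal backslash and \g<0> is
    # the whole match, i.e. the single character being escaped.
    return _METACHARS.sub(r"\\\g<0>", text)

sample = "[a-z]+ (x|y) costs $1.00"
escaped = escape_metachars(sample)

# The escaped text, used as a pattern, should match the original string exactly.
round_trip_ok = re.fullmatch(escaped, sample) is not None
```

Where a built-in such as re.escape is available, it remains the safer choice, since it tracks the metacharacters of its own regex flavor.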
