13.8. Escaping Special Characters in a Regular Expression
Problem
You want characters such as * or + to be treated as literals, not as
metacharacters, inside a regular expression. This is useful when you let
users type in search strings that you then embed in a regular expression.
Solution
Use preg_quote( ) to escape Perl-compatible regular-expression metacharacters:
$pattern = preg_quote('The Education of H*Y*M*A*N K*A*P*L*A*N').':(\d+)';
if (preg_match("/$pattern/",$book_rank,$matches)) {
    print "Leo Rosten's book ranked: ".$matches[1];
}
Use quotemeta( ) to escape POSIX metacharacters:
// The parentheses capture the rank so that $matches[1] is populated.
$pattern = quotemeta('M*A*S*H').':([0-9]+)';
if (ereg($pattern,$tv_show_rank,$matches)) {
    print 'Radar, Hot Lips, and the gang ranked: '.$matches[1];
}
Discussion
Here are the characters that preg_quote( ) escapes:
. \ + * ? ^ $ [ ] ( ) { } < > = ! | :
(PHP 5.3 and later also escape -, and PHP 7.3 and later also escape #.)
Here are the characters that quotemeta( ) escapes:
. \ + * ? ^ $ [ ] ( )
Both functions escape each metacharacter by placing a backslash in front of it.
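As a minimal sketch of that backslash escaping, the snippet below runs both functions on the titles from the Solution and then checks the escaped result against a subject string. The variable names are chosen for illustration only.

```php
<?php
// Escape the literal asterisks so they are not read as quantifiers.
$pcre_escaped  = preg_quote('H*Y*M*A*N');
$posix_escaped = quotemeta('M*A*S*H');

print $pcre_escaped . "\n";   // H\*Y\*M\*A\*N
print $posix_escaped . "\n";  // M\*A\*S\*H

// With escaping, the pattern matches the literal title.
$matched = preg_match('/^' . $pcre_escaped . '$/', 'H*Y*M*A*N');

// Without escaping, each * is a quantifier and the literal title no longer matches.
$unescaped = preg_match('/^H*Y*M*A*N$/', 'H*Y*M*A*N');
```

Here $matched is 1 and $unescaped is 0, which is the difference that escaping makes.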
The quotemeta( ) function doesn’t escape all POSIX metacharacters. The
characters {, }, and | are also valid metacharacters but aren’t escaped.
This is another good reason to use preg_match( ) instead of ereg( ). In
addition, the ereg( ) family was deprecated in PHP 5.3 and removed in
PHP 7.0.
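The gap is easy to see by passing the same string to both functions. This is a small sketch; the sample string in $input is made up for illustration.

```php
<?php
// A string containing only the metacharacters quotemeta() does not handle.
$input = 'a{2}|b';

$posix_quoted = quotemeta($input);   // braces and pipe are left untouched
$pcre_quoted  = preg_quote($input);  // braces and pipe are escaped

print $posix_quoted . "\n";  // a{2}|b
print $pcre_quoted . "\n";   // a\{2\}\|b
```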
You can also pass preg_quote( ) an additional character to escape as a
second argument. It’s useful to pass your pattern delimiter (usually /) as
this argument so it also gets escaped. This is important if you incorporate
user input into a regular-expression pattern. The following code expects
$_REQUEST['search_term'] ...
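A minimal sketch of the delimiter argument, separate from the book's own listing: the hard-coded $search_term is a hypothetical stand-in for $_REQUEST['search_term'], and the sample strings are invented for illustration.

```php
<?php
// Hypothetical stand-in for user input such as $_REQUEST['search_term'].
$search_term = 'AC/DC (live)';

// Passing '/' as the second argument escapes the delimiter as well.
$quoted = preg_quote($search_term, '/');
print $quoted . "\n";  // AC\/DC \(live\)

// The escaped term can now sit safely between / delimiters.
$matched = preg_match('/' . $quoted . '/', 'Playing: AC/DC (live) tonight');

// Without the second argument the slash is left alone, so it would
// terminate a /-delimited pattern early.
$without_delimiter = preg_quote($search_term);
print $without_delimiter . "\n";  // AC/DC \(live\)
```

If the pattern uses a different delimiter, such as # or ~, that character would be passed as the second argument instead.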