Regexp Special Characters
The metacharacters +, *, ?, and { } affect the number of times a pattern should be matched; ( ) allows you to create subpatterns; and $ and ^ affect the position of the match. + means "Match one or more of the previous expression," * means "Match zero or more of the previous expression," and ? means "Match zero or one of the previous expression." For example:
preg_match("/[A-Za-z ]*/", $string); // matches "", "a", "aaaa", "The sun has got his hat on", etc.
preg_match("/-?[0-9]+/", $string); // matches 1, 100, 324343995, and also -1, -234011, etc. The "-?" means "match exactly 0 or 1 minus symbols"
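As a rough illustration of the three quantifiers, the sketch below anchors each pattern with ^ and $ so that the whole string has to match. Without the anchors, a pattern such as /[A-Za-z ]*/ succeeds on any input, because it is allowed to match zero characters. The helper name matches() and the sample strings are assumptions made for this example.

```php
<?php
// Hypothetical helper: true when $pattern matches $subject.
function matches(string $pattern, string $subject): bool
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match($pattern, $subject) === 1;
}

// * : zero or more letters or spaces
var_dump(matches('/^[A-Za-z ]*$/', ''));         // bool(true)
var_dump(matches('/^[A-Za-z ]*$/', 'The sun'));  // bool(true)
var_dump(matches('/^[A-Za-z ]*$/', 'abc123'));   // bool(false)

// ? and + : an optional minus sign, then one or more digits
var_dump(matches('/^-?[0-9]+$/', '-234011'));    // bool(true)
var_dump(matches('/^-?[0-9]+$/', '--5'));        // bool(false)
var_dump(matches('/^-?[0-9]+$/', '-'));          // bool(false)
```

Comparing the result with === 1 keeps a "no match" (0) distinct from a pattern error (false).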
This next regexp shows two character classes, with the first being required and the second optional. As mentioned before, $ is a regexp symbol in its own right; however, here we precede it with a backslash, which works as an escape character, turning the $ into a standard character and not a regexp symbol. We match precisely one symbol from the range A-Z, a-z, and _, then match zero or more symbols from the range A-Z, a-z, underscore, and 0-9. If you're able to parse this in your head, you will see that this regexp will match PHP variable names:
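preg_match('/\$[A-Za-z_][A-Za-z_0-9]*/', $string); // single quotes: in a double-quoted string PHP would consume the backslash, leaving $ as an end-of-string anchor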
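To see the variable-name pattern at work, the sketch below collects every match from a line of source code with preg_match_all(). The pattern is written in single quotes so that the regex engine receives \$ intact; inside a double-quoted PHP string, \$ is itself an escape sequence for a plain dollar sign. The function name find_variable_names() and the sample input are illustrative assumptions.

```php
<?php
// Hypothetical helper: returns every PHP-style variable name found in $code.
function find_variable_names(string $code): array
{
    // \$ is a literal dollar sign; then one letter or underscore,
    // then zero or more letters, underscores, or digits.
    preg_match_all('/\$[A-Za-z_][A-Za-z_0-9]*/', $code, $matches);

    // Index 0 holds the full-pattern matches.
    return $matches[0];
}

print_r(find_variable_names('$total = $price * $qty2 + 5;'));
// finds $total, $price, and $qty2
```

A string such as '$9x' yields no match, because the first character after the dollar sign must come from the required class [A-Za-z_].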
Table 15-3 shows a list of regular expressions using +, *, and ?, and whether or not a match is made.
Table 15-3. Regular expressions using +, *, and ?
| Regexp | Result |
|---|---|
| | False |
| | True |