Regexp Special Characters

The metacharacters +, *, ?, and { } control how many times a pattern is matched, ( ) lets you create subpatterns, and $ and ^ anchor the match to the end and the start of the string, respectively. + means "Match one or more of the previous expression," * means "Match zero or more of the previous expression," and ? means "Match zero or one of the previous expression." For example:

    preg_match("/[A-Za-z ]*/", $string);
    // matches "", "a", "aaaa", "The sun has got his hat on", etc

    preg_match("/-?[0-9]+/", $string);
    // matches 1, 100, 324343995, and also -1, -234011, etc. The "-?" means
    // "match exactly 0 or 1 minus symbols"
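
The braces, parentheses, and anchors mentioned above can be illustrated in the same way. The following is a minimal sketch rather than a definitive reference; the variable names and sample strings are made up for illustration.

```php
<?php
// Hypothetical sample values; the variable names are for illustration only.

// {4} means "exactly four of the previous expression"; ^ and $ anchor
// the match to the start and the end of the string.
$isYear  = preg_match('/^[0-9]{4}$/', '2024');    // 1: exactly four digits
$notYear = preg_match('/^[0-9]{4}$/', '20245');   // 0: five digits is too many

// {2,3} means "between two and three of the previous expression".
$isCode = preg_match('/^[A-Z]{2,3}$/', 'ABC');    // 1

// ( ) groups a subpattern so that a quantifier applies to the whole group.
$isRepeat  = preg_match('/^(ab)+$/', 'ababab');   // 1: "ab" repeated three times
$notRepeat = preg_match('/^(ab)+$/', 'aba');      // 0: trailing "a" is not a full "ab"
```

Without the ^ and $ anchors, each of these patterns would also succeed on a longer string that merely contains a matching portion.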

This next regexp shows two character classes, the first required and the second optional. As mentioned before, $ is a regexp symbol in its own right; here, however, we precede it with a backslash, which works as an escape character and turns the $ into a standard character rather than a regexp symbol. The pattern is written in single quotes because, inside double quotes, PHP itself would consume the backslash and the regexp engine would see a bare $. We match precisely one character from the set A-Z, a-z, and _, then zero or more characters from the set A-Z, a-z, _, and 0-9. If you're able to parse this in your head, you will see that this regexp matches PHP variable names:

    preg_match('/\$[A-Za-z_][A-Za-z_0-9]*/', $string);
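
To see the variable-name pattern at work, it can help to capture what it matched. This is a small sketch under stated assumptions: the input strings and the $matches, $found, and $none variables are hypothetical, and the pattern is single-quoted so that the backslash reaches the regexp engine.

```php
<?php
// Hypothetical input string, used only for illustration.
$string = 'Set $total_2 before calling the function';

// The optional third argument receives the matched text in element 0.
if (preg_match('/\$[A-Za-z_][A-Za-z_0-9]*/', $string, $matches)) {
    echo $matches[0], "\n";   // prints $total_2
}

// The first character after the dollar sign must be a letter or an
// underscore, so a dollar sign followed by a digit is not a match.
$found = preg_match('/\$[A-Za-z_][A-Za-z_0-9]*/', 'Price: $5', $none);
```

The second class uses *, so a one-letter name such as $x is still matched; only the first character after the dollar sign is mandatory.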

Table 15-3 shows a list of regular expressions using +, *, and ?, and whether or not a match is made.

Table 15-3. Regular expressions using +, *, and ?

Regexp                                       Result
preg_match("/[A-Z]+/", "123")                False
preg_match("/[A-Z][A-Z0-9]+/i", "A123")      True
preg_match("/[0-9]?[A-Z]+/", ...
