Regular Expressions

Let’s look a little more closely at the pattern matching we have been doing. This has been achieved using regular expressions, which are supported by both JavaScript and PHP. They make it possible to pack very powerful pattern-matching rules into a single expression.

Matching Through Metacharacters

In the examples here, every regular expression is enclosed in slashes, which is how JavaScript writes a regular expression literal and the usual delimiter choice in PHP. Within these slashes, certain characters have special meanings; they are called metacharacters. For instance, an asterisk (*) has a meaning similar to what you have seen if you use a shell or the Windows Command Prompt, but not quite the same. In a regular expression, an asterisk does not stand for arbitrary text by itself; it means, “the text you’re trying to match may have any number of the preceding character, or none at all.”
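As a rough illustration of the asterisk's meaning, the following JavaScript sketch uses a made-up pattern, /ab*c/, and a handful of made-up sample strings (none of these appear in the original text). The pattern asks for an "a", then any number of "b" characters (including none), then a "c".

```javascript
// Hypothetical pattern for illustration: "a", zero or more "b", then "c".
const pattern = /ab*c/;

// Made-up sample strings to try against the pattern.
const samples = ["ac", "abc", "abbbc", "adc"];

// test() returns true when the pattern matches somewhere in the string.
const results = samples.map((s) => pattern.test(s));

console.log(results); // [ true, true, true, false ]
```

The last sample fails because "d" is not a "b", so nothing in that string satisfies the pattern. Note that the asterisk applies only to the single character before it, unlike a shell wildcard.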

For instance, let’s say you’re looking for the name “Le Guin” and know that someone might spell it with or without a space. Because the text may be laid out strangely (for instance, someone may have inserted extra spaces to right-justify lines), you might have to search a line such as:

The   difficulty  of   classifying Le      Guin's    works

So you need to match “LeGuin,” as well as “Le” and “Guin” separated by any number of spaces. The solution is to follow a space with an asterisk:

/Le *Guin/

There’s a lot more than the name “Le Guin” in the line, but that’s OK. As long as the regular expression matches some part of the line, the test function returns a true value. What if it’s important to make sure the line contains nothing but “Le Guin”? I’ll show how to ensure ...
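The behavior described above can be checked with a short JavaScript sketch. It applies the pattern to the sample line using the test method of the regular expression; the variable names and the extra "Le-Guin" string are illustrative assumptions, not part of the original text.

```javascript
// The pattern from the text: "Le", any number of spaces (or none), then "Guin".
const pattern = /Le *Guin/;

// The oddly spaced sample line from the text.
const line = "The   difficulty  of   classifying Le      Guin's    works";

// test() reports whether the pattern matches some part of the string.
const lineMatches = pattern.test(line);       // many spaces between the words
const noSpace = pattern.test("LeGuin");       // no space at all
const hyphenated = pattern.test("Le-Guin");   // a hyphen is not a space

console.log(lineMatches, noSpace, hyphenated); // true true false
```

The first two calls succeed because the asterisk allows any number of spaces, including zero. The third fails because only spaces are permitted between "Le" and "Guin".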
