O'Reilly logo

Computer Science & Perl Programming by Jon Orwant

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 17. How Regexes Work

Mark Jason Dominus

This isn’t an article about how to use regexes; you’ve seen plenty of those already. It’s about how you would write a regex package from scratch, in a language like C that didn’t already have one. I demonstrate a new module, Regex.pm, which implements regexes from nothing, in Perl. This will give you an idea of how regex matching is possible, although the details differ rather substantially from what Perl actually does.

Here’s the basic strategy. We’ll see a kind of “machine” that reads some input, one character at a time, and then, depending on what’s in the input and on the various wheels and gears in the machine, says either “yes” or “no”. The machines are simple, and it turns out that if we have a regex, it’s not hard to construct a machine that says “yes” for exactly those strings that match the regex, and “no” for all the other strings.

When our program wants to see if S matched /R/, it’ll do something like this:

  • Look at R.

  • Construct the machine that corresponds to R.

  • Feed S into the machine.

  • If the machine says “yes”, then S matched /R/. (Otherwise, it didn’t.)

Maybe this sounds bizarre, but it’s what Perl does. If you can follow what happens in this article, you’ll know what Perl is really up to when it performs a regex match.

Machines

We’re on a tight budget here, so our machines will be made of circles and arrows instead of wheels and gears. This diagram shows a machine.

Figure 17-1. 

Let’s see if this machine says “yes” to the string ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required