Chapter 4. Pattern Matching with Regular Expressions
Introduction
Suppose you have been on the Internet for a few years and have been very faithful about saving all your correspondence, just in case you (or your lawyers, or the prosecution) need a copy. The result is that you have a 50-megabyte disk partition dedicated to saved mail. And let’s further suppose that you remember that there is one letter, somewhere in there, from someone named Angie or Anjie. Or was it Angy? But you don’t remember what you called it or where you stored it. Obviously, you will have to go look for it.
But while some of you go and try to open up all 15,000,000 documents in a word processor, I’ll just find it with one simple command. Any system that provides regular expression support will allow me to search for the pattern:
An[^ dn]
in all the files. The “A” and the “n” match themselves, in effect finding words that begin with “An”, while the cryptic [^ dn] requires the “An” to be followed by a character other than a space (to eliminate the very common English word “an” at the start of a sentence) or “d” (to eliminate the common word “and”) or “n” (to eliminate Anne, Announcing, etc.). Has your word processor gotten past its splash screen yet? Well, it doesn’t matter, because I’ve already found the missing file. To find the answer, I just typed the command:[14]
grep 'An[^ dn]' *
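The same pattern can be tried out in Java. What follows is only a minimal sketch, assuming the standard java.util.regex package; the class name AnPatternDemo, the helper method matches, and the sample lines are all made up for illustration and do not come from any real mailbox. Like grep, it reports a line if the pattern occurs anywhere in it.

```java
import java.util.regex.Pattern;

public class AnPatternDemo {
    // The pattern from the text: "An" followed by any character
    // other than a space, 'd', or 'n'.
    static final Pattern PATTERN = Pattern.compile("An[^ dn]");

    // Like grep, succeed if the pattern occurs anywhere in the line.
    static boolean matches(String line) {
        return PATTERN.matcher(line).find();
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, invented for this sketch.
        String[] lines = {
            "Love, Angie",
            "An apple a day",
            "And then there were none",
            "Announcing our new product",
            "Regards, Anjie",
            "See you soon, Angy",
        };
        for (String line : lines) {
            // Print only the lines that contain a match, as grep would.
            if (matches(line)) {
                System.out.println(line);
            }
        }
    }
}
```

Run against these sample lines, the sketch should print only the three lines containing Angie, Anjie, and Angy; the lines starting with “An ”, “And”, and “Announcing” are rejected by the negated character class.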
Regular expressions, or REs for short, provide a concise and precise specification of patterns to be matched in text. Java ...