
Regular Expressions Cookbook, 2nd Edition by Steven Levithan, Jan Goyvaerts


4.17. Find Addresses with Post Office Boxes

Problem

You want to catch addresses that contain a P.O. box, and warn users that their shipping information must contain a street address.

Solution

Regular expression

^(?:Post(?:al)? (?:Office )?|P[. ]?O\.? )?Box\b
Regex options: Case insensitive, ^ and $ match at line breaks
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

C# example

Regex regexObj = new Regex(
    @"^(?:Post(?:al)? (?:Office )?|P[. ]?O\.? )?Box\b",
    RegexOptions.IgnoreCase | RegexOptions.Multiline
);
if (regexObj.IsMatch(subjectString)) {
    Console.WriteLine("The value does not appear to be a street address");
} else {
    Console.WriteLine("Good to go");
}

See Recipe 3.5 for help with running a regular expression match test like this with other programming languages. Recipe 3.4 explains how to set the regex options used here.

Discussion

The following explanation is written in free-spacing mode, so each of the meaningful space characters in the regex has been escaped with a backslash:

^                # Assert position at the beginning of a line.
(?:              # Group but don't capture:
  Post(?:al)?\   #   Match "Post " or "Postal ".
  (?:Office\ )?  #   Optionally match "Office ".
 |               #  Or:
  P[.\ ]?        #   Match "P" and an optional period or space character.
  O\.?\          #   Match "O", an optional period, and a space character.
)?               # Make the group optional.
Box              # Match "Box".
\b               # Assert position at a word boundary.
Regex options: Case insensitive, ^ and $ match at line breaks, free-spacing
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby
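As a rough illustration in one of the listed flavors, the free-spacing form can be compiled in Python with re.VERBOSE and compared against the compact form from the Solution. This is a minimal sketch, not part of the original recipe; the names PO_BOX_VERBOSE and PO_BOX_COMPACT and the sample strings are assumptions chosen for the example.

```python
import re

# Free-spacing form: literal spaces are escaped with a backslash so that
# re.VERBOSE does not discard them. (Name is hypothetical.)
PO_BOX_VERBOSE = re.compile(
    r"""
    ^                # Assert position at the beginning of a line.
    (?:              # Group but don't capture:
      Post(?:al)?\   #   Match "Post " or "Postal ".
      (?:Office\ )?  #   Optionally match "Office ".
     |               #  Or:
      P[.\ ]?        #   Match "P" and an optional period or space.
      O\.?\          #   Match "O", an optional period, and a space.
    )?               # Make the group optional.
    Box              # Match "Box".
    \b               # Assert position at a word boundary.
    """,
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)

# Compact form, as given in the Solution. (Name is hypothetical.)
PO_BOX_COMPACT = re.compile(
    r"^(?:Post(?:al)? (?:Office )?|P[. ]?O\.? )?Box\b",
    re.IGNORECASE | re.MULTILINE,
)

# Both forms should agree on every input; these samples are illustrative.
for sample in ["Post Office Box 12", "P.O. box 7", "12 Main Street", "Boxwood Lane"]:
    assert bool(PO_BOX_VERBOSE.search(sample)) == bool(PO_BOX_COMPACT.search(sample))
```

Leaving out the backslash before a space in the free-spacing form would silently drop that space from the pattern, so "Post Office Box" would no longer match.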

This regular expression matches all of the following example strings when they appear at the beginning of a line:

  • Post Office Box

  • Postal Box

  • post box

  • P.O. box

  • P O Box

  • Po. box

  • PO Box

  • Box
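The list above can be checked mechanically. The following Python sketch runs the Solution regex against each listed string, plus two street-style strings that should not match; the helper name starts_with_po_box and the two counterexamples are assumptions added for illustration.

```python
import re

PO_BOX = re.compile(
    r"^(?:Post(?:al)? (?:Office )?|P[. ]?O\.? )?Box\b",
    re.IGNORECASE | re.MULTILINE,
)

def starts_with_po_box(text: str) -> bool:
    """Return True if any line of text begins with a P.O. box designation."""
    return PO_BOX.search(text) is not None

# The example strings listed in the recipe.
listed_examples = [
    "Post Office Box",
    "Postal Box",
    "post box",
    "P.O. box",
    "P O Box",
    "Po. box",
    "PO Box",
    "Box",
]
assert all(starts_with_po_box(s) for s in listed_examples)

# Illustrative counterexamples (not from the recipe): an ordinary street
# address, and a word that merely begins with "Box" (rejected by \b).
assert not starts_with_po_box("123 Main Street")
assert not starts_with_po_box("Boxwood Lane")
```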

Despite the precautions taken here, you might encounter a few false positives or false negatives because many people are used to shippers being flexible in how they decipher addresses. To mitigate this risk, it’s best to state up front that P.O. boxes are not allowed. If you get a match using this regular expression, consider warning users that it appears they have entered a P.O. box, while still providing the option to keep the entry.
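One way to act on this advice is to return a warning instead of rejecting the input outright. The Python sketch below assumes a hypothetical helper, po_box_warning, and an invented message; it also shows one false positive and one false negative of the kind mentioned above.

```python
import re
from typing import Optional

PO_BOX = re.compile(
    r"^(?:Post(?:al)? (?:Office )?|P[. ]?O\.? )?Box\b",
    re.IGNORECASE | re.MULTILINE,
)

def po_box_warning(address: str) -> Optional[str]:
    """Return a warning if a line starts like a P.O. box, otherwise None.

    The caller decides whether to let the user keep the entry; this
    function only flags it. (Hypothetical helper, not from the recipe.)
    """
    match = PO_BOX.search(address)
    if match is None:
        return None
    return (
        f'"{match.group(0)}" looks like a P.O. box. '
        "Shipping requires a street address. Keep this entry anyway?"
    )

# Multiline mode lets ^ match at the start of the second line.
assert po_box_warning("Jane Doe\nPO Box 123\nSpringfield") is not None
assert po_box_warning("Jane Doe\n12 Main Street\nSpringfield") is None

# False positive: a street whose name begins with the word "Box".
assert po_box_warning("Box Elder Road 5") is not None

# False negative: no space between "P.O." and "Box".
assert po_box_warning("P.O.Box 123") is None
```

Because the check is advisory, a false positive costs the user one extra confirmation rather than blocking the order.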

See Also

Recipes 4.14, 4.15, and 4.16 show how to validate U.S., Canadian, and U.K. postal codes, respectively.

Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.6 explains word boundaries. Recipe 2.8 explains alternation. Recipe 2.9 explains grouping. Recipe 2.12 explains repetition.
