O'Reilly logo

Regular Expressions Cookbook, 2nd Edition by Steven Levithan, Jan Goyvaerts

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

4.14. Validate ZIP Codes

Problem

You need to validate a ZIP code (U.S. postal code), allowing both the five-digit and nine-digit (called ZIP+4) formats. The regex should match 12345 and 12345-6789, but not 1234, 123456, 123456789, or 1234-56789.

Solution

Regular expression

^[0-9]{5}(?:-[0-9]{4})?$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

VB.NET example

If Regex.IsMatch(subjectString, "^[0-9]{5}(?:-[0-9]{4})?$") Then
    Console.WriteLine("Valid ZIP code")
Else
    Console.WriteLine("Invalid ZIP code")
End If

See Recipe 3.6 for help with implementing this regular expression with other programming languages.

Discussion

A breakdown of the ZIP code regular expression follows:

^           # Assert position at the beginning of the string.
[0-9]{5}    # Match a digit, exactly five times.
(?:         # Group but don't capture:
  -         #   Match a literal "-".
  [0-9]{4}  #   Match a digit, exactly four times.
)           # End the noncapturing group.
  ?         #   Make the group optional.
$           # Assert position at the end of the string.
Regex options: Free-spacing
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

This regex is pretty straightforward, so there isn’t much to add. A simple change that would allow you to find ZIP codes within a longer input string is to replace the ^ and $ anchors with word boundaries, so you end up with \b[0-9]{5}(?:-[0-9]{4})?\b.

Tip

There is one valid ZIP+4 code that this regex will not match: 10022-SHOE. This is the only ZIP code that includes letters. In 2007, it was assigned specifically to the eighth floor Saks Fifth Avenue shoe store in New York, New York. At least thus far, however, the U.S. Postal Service has not created any other vanity ZIP codes. Mail addressed to ZIP code 10022 will still reach the shoe store (and pass validation) just fine, so we don’t think it’s worthwhile to modify the regex to shoehorn in this sole exception.

See Also

For people who deal with non-U.S. addresses, we’ve covered Canadian postal codes in Recipe 4.15, and U.K. postcodes in Recipe 4.16.

Recipe 4.17 shows how to determine whether something looks like a P.O. box address, for cases where you need to treat P.O. boxes differently than normal street addresses.

You can look up cities by ZIP code, or ZIP codes by city and state or address, at https://www.usps.com/zip4/. However, ZIP codes actually correspond to mail delivery paths rather than specific geographic locations, so there are many unusual cases including ZIP codes that cross state boundaries or that service military vessels, specific corporate buildings, or P.O. boxes.

Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.9 explains grouping. Recipe 2.12 explains repetition.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required