You want to validate dates in the traditional formats mm/dd/yy, mm/dd/yyyy, dd/mm/yy, and dd/mm/yyyy. You want to use a simple regex that only checks whether the input looks like a date, without trying to weed out things such as February 31st.
Match any of these date formats, allowing leading zeros to be omitted:
^[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Match any of these date formats, requiring leading zeros:
^[0-3][0-9]/[0-3][0-9]/(?:[0-9][0-9])?[0-9][0-9]$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Match m/d/yy and mm/dd/yyyy, allowing any combination of one or two digits for the day and month, and two or four digits for the year:
^(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Match mm/dd/yyyy, requiring leading zeros:
^(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Match d/m/yy and dd/mm/yyyy, allowing any combination of one or two digits for the day and month, and two or four digits for the year:
^(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Match dd/mm/yyyy, requiring leading zeros:
^(3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9])/[0-9]{4}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Match any of these date formats with greater accuracy, allowing leading zeros to be omitted:
^(?:(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9]))/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Match any of these date formats with greater accuracy, requiring leading zeros:
^(?:(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])|(3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9]))/[0-9]{4}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
The free-spacing option makes these last two a bit more readable:
^(?:
  # m/d or mm/dd
  (1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
|
  # d/m or dd/mm
  (3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
)
# /yy or /yyyy
/(?:[0-9]{2})?[0-9]{2}$
Regex options: Free-spacing |
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby |
^(?:
  # mm/dd
  (1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])
|
  # dd/mm
  (3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9])
)
# /yyyy
/[0-9]{4}$
Regex options: Free-spacing |
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby |
You might think that something as conceptually trivial as a date should be an easy job for a regular expression. But it isn’t, for two reasons. The first is that, because dates are such an everyday thing, humans are very sloppy with them. 4/1 may be April Fools’ Day to you. To somebody else, it may be the first working day of the year, if New Year’s Day is on a Friday. The solutions shown match some of the most common date formats.
The other issue is that regular expressions don’t deal directly with numbers. You can’t tell a regular expression to “match a number between 1 and 31”, for instance. Regular expressions work character by character. We use ‹3[01]|[12][0-9]|0?[1-9]› to match 3 followed by 0 or 1, or to match 1 or 2 followed by any digit, or to match an optional 0 followed by 1 to 9. In character classes, we can use ranges for single digits, such as ‹[1-9]›. That’s because the characters for the digits 0 through 9 occupy consecutive positions in the ASCII and Unicode character tables. See Chapter 6 for more details on matching all kinds of numbers with regular expressions.
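The claim that this alternation covers exactly the numbers 1 through 31 can be checked by brute force. The following is a minimal Python sketch, not part of the recipe itself; the names `DAY`, `accepted`, and `accepted_padded` are made up for illustration, and `fullmatch()` stands in for the ‹^› and ‹$› anchors.

```python
import re

# The day pattern from the text; fullmatch() requires the whole string to match.
DAY = re.compile(r"3[01]|[12][0-9]|0?[1-9]")

# Numbers written without a leading zero ("0", "1", ..., "99").
accepted = [n for n in range(100) if DAY.fullmatch(str(n))]

# Numbers written with exactly two digits ("00", "01", ..., "99").
accepted_padded = [n for n in range(100) if DAY.fullmatch(f"{n:02d}")]

# Both comparisons print True: only 1 through 31 are accepted in either form.
print(accepted == list(range(1, 32)))
print(accepted_padded == list(range(1, 32)))
```

The same check with ‹1[0-2]|0?[1-9]› should yield 1 through 12 for the month.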
Because of this, you have to choose how simple or how accurate you want your regular expression to be. If you already know your subject text doesn’t contain any invalid dates, you could use a trivial regex such as ‹\d{2}/\d{2}/\d{4}›. The fact that this matches things like 99/99/9999 is irrelevant if those don’t occur in the subject text. You can quickly type in this simple regex, and it will be quickly executed.
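As a rough illustration of that trade-off, here is how the trivial regex behaves in Python. This is only a sketch: `TRIVIAL` and `has_date_shape` are hypothetical names, and the `re.ASCII` flag is an addition that keeps ‹\d› limited to the digits 0 through 9, since Python 3 otherwise lets ‹\d› match other Unicode digits as well.

```python
import re

# The trivial regex from the text. re.ASCII restricts \d to the digits 0-9.
TRIVIAL = re.compile(r"\d{2}/\d{2}/\d{4}", re.ASCII)

def has_date_shape(text):
    """Hypothetical helper: True if text is two digits, two digits, four digits."""
    return TRIVIAL.fullmatch(text) is not None

print(has_date_shape("12/31/2008"))  # a real date
print(has_date_shape("99/99/9999"))  # not a date, but the same shape
print(has_date_shape("1/1/2008"))    # leading zeros omitted, so the shape differs
```

The first two calls return True and the third returns False, which is acceptable only when inputs like the second never occur.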
The first two solutions for this recipe are quick and simple, too, and they also match invalid dates, such as 0/0/00 and 31/31/2008. They use only literal characters for the date delimiters, character classes (see Recipe 2.3) for the digits, and the question mark (see Recipe 2.12) to make certain digits optional. ‹(?:[0-9]{2})?[0-9]{2}› allows the year to consist of two or four digits. ‹[0-9]{2}› matches exactly two digits. ‹(?:[0-9]{2})?› matches zero or two digits. The noncapturing group (see Recipe 2.9) is required, because the question mark needs to apply to the character class and the quantifier ‹{2}› combined. Without the group, the question mark makes the quantifier lazy, which has no effect because ‹{2}› cannot repeat more than two times or fewer than two times: ‹[0-9]{2}?› matches exactly two digits, just like ‹[0-9]{2}›.
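The difference the group makes is easy to see side by side. The sketch below uses Python with illustrative names (`YEAR`, `NO_GROUP`, `results`) that are not from the recipe; it compares the year pattern with a variant in which the noncapturing group has been dropped.

```python
import re

# With the group, the question mark makes the first two digits optional.
YEAR = re.compile(r"(?:[0-9]{2})?[0-9]{2}")

# Without the group, the question mark only makes {2} lazy,
# so this variant always needs exactly four digits.
NO_GROUP = re.compile(r"[0-9]{2}?[0-9]{2}")

results = {
    s: (YEAR.fullmatch(s) is not None, NO_GROUP.fullmatch(s) is not None)
    for s in ["08", "2008", "008", "8"]
}

for s, (with_group, without_group) in results.items():
    print(f"{s!r}: with group={with_group}, without group={without_group}")
```

Only the grouped version accepts the two-digit year "08"; both accept "2008", and neither accepts one or three digits.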
Solutions 3 through 6 restrict the month to numbers between 1 and 12, and the day to numbers between 1 and 31. We use alternation (see Recipe 2.8) inside a group to match various pairs of digits to form a range of two-digit numbers. We use capturing groups here because you’ll probably want to capture the day and month numbers anyway.
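To show what those capturing groups give you, here is a small Python sketch built on the third solution. The names `MDY` and `month_and_day` are hypothetical, not part of the recipe.

```python
import re

# Solution 3 from the text: m/d/yy through mm/dd/yyyy.
# Group 1 holds the month and group 2 holds the day.
MDY = re.compile(
    r"^(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$"
)

def month_and_day(text):
    """Hypothetical helper: return (month, day) as integers, or None if no match."""
    m = MDY.match(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

print(month_and_day("12/05/2008"))  # (12, 5)
print(month_and_day("2/31/08"))     # (2, 31): February 31st still passes
print(month_and_day("13/01/2008"))  # None: there is no month 13
```

Weeding out dates such as February 31st takes procedural code that works on the captured numbers. One Python-specific caveat: ‹$› also matches just before a trailing newline, so `re.fullmatch` or ‹\Z› is the stricter choice when the input may end in one.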
The final two solutions are a little more complex, so we’re presenting these in both condensed and free-spacing form. The only difference between the two forms is readability. JavaScript does not support free-spacing. The final solutions allow all of the date formats, just like the first two examples. The difference is that the last two use an extra level of alternation to restrict the dates to the ranges up to 12/31 and 31/12, disallowing invalid combinations such as 31/31.
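In Python, the free-spacing option is called `re.VERBOSE`. The sketch below is one possible way to use the free-spacing regex that allows leading zeros to be omitted; `DATE` and `classify` are illustrative names. Because the alternatives are tried from left to right, a date that fits both orders, such as 5/6/08, is matched by the m/d branch.

```python
import re

DATE = re.compile(r"""
    ^(?:
        # m/d or mm/dd: groups 1 (month) and 2 (day)
        (1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
      |
        # d/m or dd/mm: groups 3 (day) and 4 (month)
        (3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
    )
    # /yy or /yyyy
    /(?:[0-9]{2})?[0-9]{2}$
""", re.VERBOSE)

def classify(text):
    """Hypothetical helper: report which branch matched, or None if neither did."""
    m = DATE.match(text)
    if m is None:
        return None
    return "m/d" if m.group(1) is not None else "d/m"

for sample in ["12/31/2008", "31/12/2008", "31/31/2008", "5/6/08"]:
    print(sample, classify(sample))
```

The regex alone cannot tell whether 5/6/08 was meant as May 6th or June 5th; it only reports the first branch that fits.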
If you want to search for dates in larger bodies of text instead of checking whether the input as a whole is a date, you cannot use the anchors ‹^› and ‹$›. Merely removing the anchors from the regular expression is not the right solution. That would allow any of these regexes to match 12/12/2001 within 9912/12/200199, for example. Instead of anchoring the regex match to the start and end of the subject, you have to specify that the date cannot be part of longer sequences of digits.
This is easily done with a pair of word boundaries. In regular expressions, digits are treated as characters that can be part of words. Replace both ‹^› and ‹$› with ‹\b›. As an example:
\b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}\b
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
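The effect of the word boundaries can be seen by running the regex with and without them over the same text. This is a minimal Python sketch; the sample sentence and the names `DATE_IN_TEXT`, `UNANCHORED`, `bounded`, and `unbounded` are invented for illustration.

```python
import re

# The word-boundary version from the text.
DATE_IN_TEXT = re.compile(
    r"\b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}\b"
)

# The same regex with the boundaries simply removed.
UNANCHORED = re.compile(
    r"(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}"
)

text = "Shipped 12/12/2001, ref 9912/12/200199."

bounded = [m.group(0) for m in DATE_IN_TEXT.finditer(text)]
unbounded = [m.group(0) for m in UNANCHORED.finditer(text)]

print(bounded)    # ['12/12/2001']
print(unbounded)  # ['12/12/2001', '12/12/2001']
```

The second match in the unbounded list comes from inside 9912/12/200199, which is exactly the false positive the word boundaries prevent.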