Chapter 6. Numbers

Regular expressions are designed to deal with text, and don’t understand the numerical meanings that humans assign to strings of digits. To a regular expression, 56 is not the number fifty-six, but a string consisting of two characters displayed as the digits 5 and 6. The regex engine knows they’re digits, because the shorthand character class \d matches them (see Recipe 2.3). But that’s it. It doesn’t know that 56 has a higher meaning, just as it doesn’t know that :-) is anything but three punctuation characters matched by \p{P}{3}.
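The point can be sketched in Python, one of the flavors this book covers. This is a minimal illustration, not part of the original recipe; the variable names are assumptions chosen for the example. The regex sees two digit characters, and the numeric value appears only once the surrounding program converts the string.

```python
import re

text = "56"

# \d matches each digit as a separate character; the regex has no notion of "fifty-six".
digits = re.findall(r"\d", text)
print(digits)

# The numeric meaning exists only after the program itself converts the string.
value = int(text)
print(value + 1)
```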

But numbers are some of the most important input you’re likely to deal with. Sometimes you need to process them inside a regular expression, rather than passing them to a conventional programming language, to answer questions such as “Is this number within the range 1 through 100?” So we’ve devoted a whole chapter to matching all kinds of numbers with regular expressions. We start with a few recipes that may seem trivial but explain important basic concepts. The later recipes, which deal with more complicated regexes, assume you grasp these concepts.

6.1. Integer Numbers

Problem

You want to find various kinds of integer decimal numbers in a larger body of text, or check whether a string variable holds an integer decimal number.

Solution

Find any positive integer decimal number in a larger body of text:

\b[0-9]+\b
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
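
As a rough sketch of how this regex might be applied, here it is in Python, one of the flavors listed above. The sample text and the names INTEGER, sample, and found are assumptions made up for this illustration. The word boundaries keep the regex from matching digits that are glued to letters.

```python
import re

# The regex from the solution: one or more digits 0-9 between word boundaries.
INTEGER = re.compile(r"\b[0-9]+\b")

# Hypothetical sample text: "A7" and "2024x" contain digits attached to letters,
# so no word boundary separates the digits from the letters.
sample = "Order 56 shipped 3 items; part A7 and code 2024x are not standalone numbers."

found = INTEGER.findall(sample)
print(found)
```

Only 56 and 3 are found here. The 7 in A7 and the 2024 in 2024x are skipped because a letter sits directly next to the digits, so \b cannot match at that position.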

Check ...
