Regular Expressions

A regular expression (also known as a regexp or regex) describes a textual pattern. Ruby’s Regexp class implements regular expressions, and both Regexp and String define pattern-matching methods and operators. Like most languages that support regular expressions, Ruby’s Regexp syntax follows closely (but not precisely) the syntax of Perl 5.

Regexp Literals

Regular expression literals are delimited by forward slash characters:

/Ruby?/  # Matches the text "Rub" followed by an optional "y"

The closing slash character isn’t a true delimiter because a regular expression literal may be followed by one or more optional flag characters that specify additional information about how pattern matching is to be done. For example:

/ruby?/i  # Case-insensitive: matches "ruby" or "RUB", etc.
/./mu     # Matches Unicode characters in Multiline mode
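The effect of the i and m flags can be checked directly with the =~ operator, which returns the index of the first match or nil. The following is a minimal sketch; the variable names and test strings are illustrative and not taken from the text:

```ruby
# Same pattern, with and without the i flag.
sensitive_hit   = (/ruby?/  =~ "RUB")   # nil: case must match
insensitive_hit = (/ruby?/i =~ "RUB")   # 0: match starts at index 0

# Without m, . stops at a newline; with m, it matches one.
plain_dot     = (/a.b/  =~ "a\nb")      # nil
multiline_dot = (/a.b/m =~ "a\nb")      # 0

# Several flags may follow the closing slash together.
combined = (/A.B/mi =~ "a\nb")          # 0
```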

The allowed modifier characters are shown in Table 9-1.

Table 9-1. Regular expression modifier characters

Modifier    Description
i           Ignore case when matching text.
m           The pattern is to be matched against multiline text, so treat newline as an ordinary character: allow . to match newlines.
x           Extended syntax: allow whitespace and comments in regexp.
o           Perform #{} interpolations only once, the first time the regexp literal is evaluated.
u, e, s, n  Interpret the regexp as Unicode (UTF-8), EUC, SJIS, or ASCII. If none of these modifiers is specified, the regular expression is assumed to use the source encoding.
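The x and o modifiers from Table 9-1 are easier to see in action than to describe. The sketch below assumes a made-up time pattern and a hypothetical helper method, word_pattern; neither comes from the text:

```ruby
# x: whitespace and comments inside the literal are ignored.
time_pattern = /
  (\d\d)   # hours
  :        # literal colon
  (\d\d)   # minutes
/x
time_hit = (time_pattern =~ "at 09:30")   # 3: match starts at index 3

# o: #{} is interpolated only the first time the literal is evaluated.
# word_pattern is a hypothetical helper used only for this illustration.
def word_pattern(word)
  /#{word}/o
end

first  = word_pattern("cat").source   # "cat"
second = word_pattern("dog").source   # still "cat": the literal is not rebuilt
```

Because o freezes the first interpolated value for the life of the program, it suits patterns built from values that never change, and it is a likely source of surprises when the interpolated value varies between calls.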

Like string literals delimited with %Q, Ruby ...
