Regular Expressions

When dealing with string-manipulation functions in programming languages, the notion of regular expressions sometimes arises. In R, it matters when you use the string functions grep(), grepl(), regexpr(), gregexpr(), sub(), gsub(), and strsplit(), because by default each of them interprets its pattern argument as a regular expression rather than as a literal string.

A regular expression is a kind of wild card: a shorthand for specifying broad classes of strings. For example, the expression "[au]" matches a single character that is either a or u, so when used as a search pattern it picks out any string that contains at least one of those letters. You could use it like this:

> grep("[au]",c("Equator","North Pole","South Pole"))
[1] 1 3

This reports that elements 1 and 3 of ("Equator","North Pole","South Pole")—that is, “Equator” and “South Pole”—contain either an a or a u.
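
The same pattern works with the other functions listed earlier. The following is a minimal sketch rather than an example from the original text; the variable names places, idx, hits, flags, and masked are illustrative assumptions.

```r
# Illustrative vector; same strings as in the grep() call above
places <- c("Equator", "North Pole", "South Pole")

# Indices of the elements containing an a or a u
idx <- grep("[au]", places)

# value = TRUE returns the matching strings instead of their indices
hits <- grep("[au]", places, value = TRUE)

# grepl() returns one logical value per element
flags <- grepl("[au]", places)

# gsub() replaces every match; here each a or u becomes an underscore
masked <- gsub("[au]", "_", places)

print(idx)     # 1 3
print(hits)    # "Equator" "South Pole"
print(flags)   # TRUE FALSE TRUE
print(masked)  # "Eq__tor" "North Pole" "So_th Pole"
```

Note that matching is case sensitive by default, so "[au]" would not match an uppercase A or U.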

A period (.) represents any single ...
