Chapter 11. Strings with stringr


This chapter introduces you to string manipulation in R. You’ll learn the basics of how strings work and how to create them by hand, but the focus of this chapter will be on regular expressions, or regexps for short. Regular expressions are useful because strings usually contain unstructured or semi-structured data, and regexps are a concise language for describing patterns in strings. When you first look at a regexp, you’ll think a cat walked across your keyboard, but as your understanding improves they will soon start to make sense.


This chapter will focus on the stringr package for string manipulation. stringr is not part of the core tidyverse because you don’t always have textual data, so we need to load it explicitly.


String Basics

You can create strings with either single quotes or double quotes. Unlike in some other languages, the two behave identically. I recommend always using ", unless you want to create a string that itself contains multiple ":

string1 <- "This is a string"
string2 <- 'To put a "quote" inside a string, use single quotes'
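
As a quick check of the claim that the two quote styles behave the same, here is a minimal base R sketch. The variable names (double_quoted, single_quoted, quotes_equivalent, quoted) are made up for illustration and do not come from the chapter:

```r
# Both quote styles create the same length-one character vector.
double_quoted <- "This is a string"
single_quoted <- 'This is a string'

# identical() compares the two values exactly.
quotes_equivalent <- identical(double_quoted, single_quoted)
print(quotes_equivalent)

# Single quotes on the outside let double quotes appear inside unescaped.
quoted <- 'She said "hello"'
print(nchar(quoted))  # the inner quotes count as ordinary characters
```

The choice of quote style affects only how you type the string, not the value that R stores.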

If you forget to close a quote, you’ll see +, the continuation prompt, on the next line:

> "This is a string without a closing quote
+

If this happens to you, press Esc and try again!

To include a literal single or double quote in a string you can use \ to “escape” it:

double_quote <- "\"" # or '"'
single_quote <- '\'' # or "'"

That means ...
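
To see what the escaped strings above actually contain, the following base R sketch compares the printed representation with the raw contents. The names n_double and n_single are hypothetical helpers added here for illustration:

```r
double_quote <- "\""  # or '"'
single_quote <- '\''  # or "'"

# print() shows the string as you would type it, escape included.
print(double_quote)

# writeLines() shows the raw contents, without the escape.
writeLines(double_quote)
writeLines(single_quote)

# The backslash is not stored: each string holds exactly one character.
n_double <- nchar(double_quote)
n_single <- nchar(single_quote)
print(c(n_double, n_single))
```

The backslash is part of the syntax for writing the string, not part of the string itself, which is why both lengths come out as one.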
