Chapter 11. Strings with stringr

Introduction

This chapter introduces you to string manipulation in R. You’ll learn the basics of how strings work and how to create them by hand, but the focus of this chapter will be on regular expressions, or regexps for short. Regular expressions are useful because strings usually contain unstructured or semi-structured data, and regexps are a concise language for describing patterns in strings. When you first look at a regexp, you’ll think a cat walked across your keyboard, but as your understanding improves, they will start to make sense.

Prerequisites

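This chapter will focus on the stringr package for string manipulation. stringr has been part of the core tidyverse since version 1.2.0, so library(tidyverse) already attaches it; on older installations it has to be loaded explicitly, and the extra library(stringr) call is harmless either way.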

library(tidyverse)
library(stringr)

String Basics

You can create strings with either single quotes or double quotes. Unlike some other languages, R treats the two identically. I recommend always using ", unless the string itself contains one or more ", in which case single quotes save you from escaping them:

string1 <- "This is a string"
string2 <- 'To put a "quote" inside a string, use single quotes'
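As a quick check that the two quoting styles really are interchangeable, here is a minimal sketch using only base R. The variable names (double_quoted, single_quoted, same, has_quote) are made up for illustration.

```r
# The same text written with each quoting style.
double_quoted <- "This is a string"
single_quoted <- 'This is a string'

# identical() compares the resulting objects; the delimiter used in the
# source code is not stored in the string.
same <- identical(double_quoted, single_quoted)
print(same)

# With single quotes outside, double quotes inside need no escaping.
string2 <- 'To put a "quote" inside a string, use single quotes'
has_quote <- grepl('"', string2, fixed = TRUE)
print(has_quote)
```

Both print() calls should report TRUE: the quote characters only delimit the literal, so the choice between them is purely a matter of convenience.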

If you forget to close a quote, you’ll see +, the continuation character:

> "This is a string without a closing quote
+
+
+ HELP I'M STUCK

If this happens to you, press Esc (or Ctrl+C if you are running R in a terminal) and try again!

To include a literal single or double quote in a string you can use \ to “escape” it:

double_quote <- "\"" # or '"'
single_quote <- '\'' # or ...
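It can help to confirm that the backslash belongs to the source code rather than to the string itself. The sketch below, using only base R, checks the length of each escaped string and contrasts the printed representation with the raw contents.

```r
double_quote <- "\""
single_quote <- '\''

# Each string holds exactly one character; the backslash is not stored.
n_double <- nchar(double_quote)
n_single <- nchar(single_quote)
print(c(n_double, n_single))

# print() shows the string as you would type it, escapes included.
print(double_quote)

# writeLines() shows the raw contents of the string.
writeLines(double_quote)

# The escaped form and the alternate-quote form are the same string.
print(identical(double_quote, '"'))
```

The printed representation is therefore not the string itself: print() re-adds the escapes, while writeLines() emits just the single quote character.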
