Chapter 17. Strings and Text
The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.
Alan Perlis, epigram #34
We’ve been using Rust’s main textual types, String, str, and char, throughout the book. In “String Types”, we described the syntax for character and string literals and showed how strings are represented in memory. In this chapter, we cover text handling in more detail.
In this chapter:
- We give you some background on Unicode that should help you make sense of the standard library’s design.
- We describe the char type, representing a single Unicode code point.
- We describe the String and str types, representing owned and borrowed sequences of Unicode characters. These have a broad variety of methods for building, searching, modifying, and iterating over their contents.
- We cover Rust’s string formatting facilities, like the println! and format! macros. You can write your own macros that work with formatting strings and extend them to support your own types.
- We give an overview of Rust’s regular expression support.
- Finally, we talk about why Unicode normalization matters and show how to do it in Rust.
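As a quick orientation before the details, the following minimal sketch touches the three textual types and the format! macro named in the list above. The helper names char_count and greet are hypothetical, chosen only for illustration; everything else comes from the standard library.

```rust
/// Count the Unicode code points (chars) in a borrowed string slice.
/// (Hypothetical helper, for illustration only.)
fn char_count(s: &str) -> usize {
    s.chars().count()
}

/// Build an owned String from a borrowed &str using format!.
/// (Hypothetical helper, for illustration only.)
fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    // A char holds a single Unicode code point; U+00E9 is 'é'.
    let c: char = '\u{e9}';

    // A &str is a borrowed slice of UTF-8 text.
    let borrowed: &str = "caf\u{e9}";

    // A String is an owned, growable buffer of UTF-8 text.
    let owned: String = borrowed.to_uppercase();

    // len() counts UTF-8 bytes, while chars() iterates over code points,
    // so the two numbers differ for non-ASCII text.
    println!("{} takes {} bytes in UTF-8", c, c.len_utf8()); // 2 bytes
    println!("{}: {} bytes, {} chars", borrowed, borrowed.len(), char_count(borrowed)); // 5 bytes, 4 chars
    println!("{}", owned);
    println!("{}", greet("Rust"));
}
```

The gap between the byte length and the character count is the kind of detail that the Unicode background in the next section helps explain.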
Some Unicode Background
This book is about Rust, not Unicode, which has entire books devoted to it already. But Rust’s character and string types are designed around Unicode. Here are a few bits of Unicode that help explain Rust.
ASCII, Latin-1, and Unicode ...