Chapter 7. String Fundamentals
So far, we’ve studied numbers and explored Python’s dynamic typing model. The next major type on our in-depth object tour is the Python string—an ordered collection of characters used to store and represent text- and bytes-based information. We looked briefly at strings in Chapter 4. Here, we will revisit them to fill in details we skipped earlier.
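As a quick refresher on what "ordered collection of characters" means in practice, here is a minimal sketch. The variable name `text` and the sample value `'spam'` are illustrative choices only, not taken from this chapter.

```python
# A minimal sketch: a Python string keeps its characters in a fixed,
# left-to-right order, so each character can be reached by position.
text = 'spam'          # illustrative sample value

print(type(text))      # the built-in text string type, str
print(len(text))       # number of characters in the string
print(text[0])         # first character, at offset 0
print(text[-1])        # last character, counted from the end
print(list(text))      # the characters, in their stored order
```

Because the order is part of the string's value, `'spam'` and `'maps'` are different strings even though they contain the same characters.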
Before we get started, let’s get clear on what we won’t be covering here. Chapter 4 also briefly previewed Unicode strings and files—tools for dealing with non-ASCII text. Unicode is a key tool for programmers, especially those who work in the internet domain. It can pop up, for example, in web pages, emails, GUI toolkits, file-processing tools, XML and JSON text, and more. At the same time, Unicode can be a heavy topic for programmers just starting out, and a complete understanding of it relies on tools that we haven’t yet studied in full, like files.
In light of that, this book splits its string coverage between the essentials here and their extension to Unicode and byte strings in Chapter 37, in the advanced topics part. That is, this chapter tells only part of the string story in Python: the part that most scripts use, and most Python learners need to know up front. Despite this limited scope, everything we learn here will apply directly to Unicode and bytes processing, too, because Python text strings are Unicode, even if they’re simple ASCII text, and byte strings are simply strings constrained to ...