Chapter 37. Unicode and Byte Strings
So far, our exploration of strings in this book has been deliberately incomplete. Chapter 4's types preview briefly introduced Python's Unicode strings and files without giving many details, and the strings chapter in the core types part of this book (Chapter 7) deliberately limited its scope to the subset of string topics that most Python programmers need to know about.
This was by design: because many programmers, including most beginners, deal with simple forms of text like ASCII, they can happily work with Python's basic str string type and its associated operations and don't need to come to grips with more advanced string concepts. In fact, such programmers can often ignore the string changes in Python 3.X and continue to use strings as they may have in the past.
On the other hand, many other programmers deal with more specialized types of data: non-ASCII character sets, image file contents, and so on. For those programmers, and others who may someday join them, in this chapter we're going to fill in the rest of the Python string story and look at some more advanced concepts in Python's string model.
Specifically, we'll explore the basics of Python's support for Unicode text (rich character strings used in internationalized applications) as well as binary data (strings that represent absolute byte values). As we'll see, the advanced string representation story has diverged in recent versions of Python:
Python 3.X provides ...
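To make the distinction between Unicode text and binary data concrete, here is a minimal sketch, assuming Python 3.X. The variable names (accented, data, raw) are illustrative only and are not taken from the book's own examples.

```python
# Assumes Python 3.X: str holds Unicode text, bytes holds raw byte values.
accented = 'sp\xe4m'             # text with one non-ASCII character (a-umlaut)

# Encoding translates text into bytes according to a named encoding.
data = accented.encode('utf-8')

print(type(accented), len(accented))   # str: length counts characters
print(type(data), len(data))           # bytes: length counts bytes
print(data)

# Decoding translates bytes back into text; the round trip is lossless here.
print(data.decode('utf-8') == accented)

# A bytes object is a sequence of small integers (absolute byte values).
raw = b'\x00\xff'
print(list(raw))
```

In this sketch the text has four characters but its UTF-8 form occupies five bytes, because the single non-ASCII character needs two bytes in that encoding. Simple ASCII text, by contrast, encodes to one byte per character, which is why programmers working only with ASCII can often ignore the difference.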