Chapter 9. Strings and Things

Python’s str type implements Unicode text strings, supported by operators, built-in functions, methods, and dedicated modules. The somewhat similar bytes type represents arbitrary binary data as a sequence of bytes, also known as a bytestring or byte string. Many textual operations are possible on objects of either type. Since both types are immutable, their methods never modify an object in place: they mostly create and return a new string, or return the subject string unchanged. A mutable sequence of bytes can be represented as a bytearray, briefly introduced in “bytearray objects”.
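As a minimal sketch of the immutability point above (the variable names are illustrative only), the following shows a method call leaving the original str and bytes objects untouched, while a bytearray built from the same data can be changed in place:

```python
text = "nutshell"
upper = text.upper()        # returns a new str; text itself is unchanged

data = b"nutshell"
data_upper = data.upper()   # bytes offers many of the same methods as str

# Item assignment is rejected on the immutable types.
try:
    text[0] = "N"
    str_is_immutable = False
except TypeError:
    str_is_immutable = True

buf = bytearray(data)       # a mutable copy of the same bytes
buf[0] = ord("N")           # in-place change is allowed for bytearray

print(text, upper, data, data_upper, buf, str_is_immutable)
```

Note that bytearray items are integers, which is why the assignment uses ord("N") rather than a one-character string.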

This chapter first covers the methods available on these three types, then discusses the string module and string formatting (including formatted string literals), followed by the textwrap, pprint, and reprlib modules. Issues related specifically to Unicode are covered at the end of the chapter.

Methods of String Objects

str, bytes, and bytearray objects are sequences, as covered in “Strings”; of these, only bytearray objects are mutable. All immutable-sequence operations (repetition, concatenation, indexing, and slicing) apply to instances of all three types. Repetition, concatenation, and slicing return a new object of the same type; indexing a str returns a one-character str, while indexing a bytes or bytearray object returns an int. Unless otherwise specified in Table 9-1, methods are present on objects of all three types. Most methods of str, bytes, and bytearray objects return values of the same type, or are specifically intended to convert among representations.
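The sequence operations named above can be tried on one sample of each type. This is a small sketch, with the results dictionary introduced purely for illustration. One detail is worth checking directly: indexing a str gives a one-character str, whereas indexing a bytes or bytearray object gives an int.

```python
samples = ["abc", b"abc", bytearray(b"abc")]

results = {}
for s in samples:
    results[type(s).__name__] = {
        "repeat": s * 2,    # repetition
        "concat": s + s,    # concatenation
        "slice": s[1:],     # slicing
        "index": s[0],      # indexing
    }

for name, ops in results.items():
    print(name, ops)
```

Repetition, concatenation, and slicing each produce an object of the same type as the operand, so the bytearray results are themselves new bytearray objects.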

Terms such as “letters,” “whitespace,” and so on refer to the corresponding attributes ...

