Chapter 8. Strings and Things

v3 supplies Unicode text strings as type str, with operators, built-in functions, methods, and dedicated modules. It also supplies the somewhat similar bytes type, representing arbitrary binary data as a sequence of bytes, also known as a bytestring or byte string. This is a major difference from v2, where type str is a sequence of bytes, while Unicode text strings are of type unicode. Many textual operations, in both versions, are possible on objects of either type.
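As a minimal sketch of the v3 distinction described above (the sample text and the choice of UTF-8 are illustrative assumptions, not taken from the book), a str can be encoded to a bytes object and decoded back:

```python
# v3 only: str holds Unicode text, bytes holds raw binary data.
text = 'caf\u00e9'                 # str: four Unicode characters
data = text.encode('utf-8')        # bytes: the UTF-8 encoding of text

text_len = len(text)               # counts characters
data_len = len(data)               # counts bytes (the accented e needs two)

# Decoding with the same codec recovers the original text.
roundtrip = data.decode('utf-8')

print(type(text).__name__, text_len)
print(type(data).__name__, data_len)
```

The two lengths differ because a str counts characters while a bytes object counts the bytes of one particular encoding.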

This chapter covers the methods of string objects, in “Methods of String and Bytes Objects”; string formatting, in “String Formatting”; and the modules string (in “The string Module”) and pprint (in “The pprint Module”). Issues related specifically to Unicode are also covered, in “Unicode”. The new (v3.6) formatted string literals are covered in “New in 3.6: Formatted String Literals”.

Methods of String and Bytes Objects

Unicode str and bytes objects are immutable sequences, as covered in “Strings”. All immutable-sequence operations (repetition, concatenation, indexing, and slicing) apply to them. Repetition, concatenation, and slicing return an object of the same type; indexing a str returns a one-character str, while indexing a v3 bytes object returns an int. A string or bytes object s also supplies several nonmutating methods, as documented in Table 8-1.
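The sequence operations named above can be sketched as follows in v3 (the values `'spam'` and `b'spam'` are arbitrary examples). Note that in v3, indexing a bytes object yields an int rather than a length-1 bytes object:

```python
s = 'spam'
b = b'spam'

# Repetition, concatenation, and slicing return an object of the same type.
s_rep, b_rep = s * 2, b * 2            # 'spamspam', b'spamspam'
s_cat, b_cat = s + '!', b + b'!'       # 'spam!', b'spam!'
s_slice, b_slice = s[1:3], b[1:3]      # 'pa', b'pa'

# Indexing: a str yields a one-character str; a v3 bytes object yields an int.
s_item, b_item = s[0], b[0]            # 's', 115

# Both types are immutable, so item assignment raises TypeError.
immutable = False
try:
    s[0] = 'S'
except TypeError:
    immutable = True
```

To get a length-1 bytes object instead of an int, a one-item slice such as `b[0:1]` can be used.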

Unless otherwise noted, methods are present on objects of either type. In v3, str methods return a Unicode string, while methods of bytes objects return a bytestring (in v2, type unicode stands for a textual string—i.e., Unicode—and type str for a bytestring). ...
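A short v3 sketch of this rule, using a few methods chosen only for illustration: the same method called on a str or on a bytes object returns a result of the matching type, arguments must match that type too, and a small number of methods exist on only one of the two types.

```python
# The same method returns the type it was called on.
text_upper = 'hello'.upper()           # 'HELLO'
bytes_upper = b'hello'.upper()         # b'HELLO'

# Arguments must match the object's type.
text_parts = 'a,b'.split(',')          # ['a', 'b']
bytes_parts = b'a,b'.split(b',')       # [b'a', b'b']

mixing_fails = False
try:
    'a,b'.split(b',')                  # str method given a bytes separator
except TypeError:
    mixing_fails = True

# A few methods are present on only one type in v3.
str_has_encode = hasattr('', 'encode')      # str -> bytes conversion
bytes_has_decode = hasattr(b'', 'decode')   # bytes -> str conversion
bytes_has_format = hasattr(b'', 'format')   # format is a str-only method
```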
