Chapter 8. Strings and Things

v3 supplies Unicode text strings as type str, with operators, built-in functions, methods, and dedicated modules. It also supplies the somewhat similar bytes type, representing arbitrary binary data as a sequence of bytes, also known as a bytestring or byte string. This is a major difference from v2, where type str is a sequence of bytes, while Unicode text strings are of type unicode. Many textual operations, in both versions, are possible on objects of either type.
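As a minimal sketch of the v3 distinction described above (the variable names are illustrative only, not taken from the book), the following shows a text string, its encoding to a bytestring, and the decoding back:

```python
# v3 sketch: text (str) and binary data (bytes) are distinct types.
text = 'caf\u00e9'              # str: four Unicode code points
data = text.encode('utf-8')     # bytes: the UTF-8 encoding of that text

text_len = len(text)            # counts characters
data_len = len(data)            # counts bytes; U+00E9 takes two bytes in UTF-8

round_trip = data.decode('utf-8')   # decoding recovers the original text
```

The lengths differ because len counts code points for str but individual bytes for bytes.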

This chapter covers the methods of string objects, in “Methods of String and Bytes Objects”; string formatting, in “String Formatting”; and the modules string (in “The string Module”) and pprint (in “The pprint Module”). Issues related specifically to Unicode are also covered, in “Unicode”. The new (v3.6) formatted string literals are covered in “New in 3.6: Formatted String Literals”.

Methods of String and Bytes Objects

Unicode str and bytes objects are immutable sequences, as covered in “Strings”. All immutable-sequence operations (repetition, concatenation, indexing, and slicing) apply to them. Repetition, concatenation, and slicing return an object of the same type; indexing a str also returns a str, but in v3 indexing a bytes object returns an int. A string or bytes object s also supplies several nonmutating methods, as documented in Table 8-1.
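The sequence operations named above can be sketched as follows in v3. The names are illustrative only. Note that, in v3, indexing a bytes object yields an int rather than a length-1 bytes object:

```python
s = 'hello'
b = b'hello'

repeated = s * 2            # repetition returns a new str
joined = b + b' world'      # concatenation returns a new bytes object
sliced_s = s[1:4]           # slicing a str returns a str
sliced_b = b[1:4]           # slicing a bytes object returns bytes
first_s = s[0]              # indexing a str returns a length-1 str
first_b = b[0]              # indexing bytes returns an int in v3

# Both types are immutable: item assignment raises TypeError.
try:
    s[0] = 'J'
    mutated = True
except TypeError:
    mutated = False
```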

Unless otherwise noted, methods are present on objects of either type. In v3, str methods return a Unicode string, while methods of bytes objects return a bytestring (in v2, type unicode stands for a textual string—i.e., Unicode—and type str for a bytestring). ...
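A short v3 sketch of this point, using a few methods that exist on both types (the sample values are illustrative only): each method returns a result of the same type as the object it is called on, and arguments must match that type.

```python
title_s = 'python in a nutshell'.title()   # str method returns str
upper_b = b'python'.upper()                # bytes method returns bytes

parts_s = 'a,b,c'.split(',')               # list of str
parts_b = b'a,b,c'.split(b',')             # list of bytes

# Mixing the two types in one call is an error in v3.
try:
    'a,b,c'.split(b',')
    mixed_ok = True
except TypeError:
    mixed_ok = False
```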
