String Module Utilities

Python’s string module includes a variety of text-processing utilities that go beyond what string expression operators provide. For instance:

  • string.find performs substring searches.

  • string.atoi converts strings to integers.

  • string.strip removes leading and trailing whitespace.

  • string.upper converts to uppercase.

  • string.replace performs substring substitutions.
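As a brief illustration of the tools just listed, the following sketch exercises each one. It assumes a Python 3 interpreter, where these module-level functions no longer exist, so it uses the equivalent string object methods, with the built-in int standing in for string.atoi. The sample text and variable names are invented for the example.

```python
# Assumes Python 3: the string-module functions are replaced by string methods.
text = "  spam and eggs  "   # hypothetical sample data

position = text.find("and")               # substring search: offset, or -1 if absent
number = int("42")                        # string to integer (stands in for string.atoi)
trimmed = text.strip()                    # drop leading and trailing whitespace
shouted = trimmed.upper()                 # convert to uppercase
swapped = trimmed.replace("eggs", "ham")  # substring substitution

print(position, number, repr(trimmed), shouted, swapped)
```

None of these calls changes text in place; each returns a new value, because strings are immutable.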

The Python library manual includes an exhaustive list of available tools. Moreover, as of Python 2.0, Unicode (wide) strings are fully supported by Python string tools, and most of the string module’s functions are also now available as string object methods. For instance, in Python 2.0, the following two expressions are equivalent:

string.find(str, substr)      # traditional
str.find(substr)              # new in 2.0

except that the second form does not require callers to import the string module first. As usual, you should consult the library manuals and Appendix A for late-breaking news on the string tools front.
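The equivalence of the two call styles can be checked directly. The sketch below assumes Python 3, where string.find has been removed, so a call through the built-in str type stands in for the traditional function-style form; it is an analogy to the 2.0-era call, not the same function. The variable names are illustrative only.

```python
# Assumes Python 3, where string.find is gone; str.find(text, ...) stands in
# for the traditional function-style call.
text = "parsing text with Python"   # hypothetical sample data
substr = "text"

function_style = str.find(text, substr)   # function call, string passed as an argument
method_style = text.find(substr)          # method call on the string object itself

print(function_style, method_style, function_style == method_style)
```

Neither form here needs an import, since str is a built-in name.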

In terms of this chapter’s main focus, though, Python’s built-in tools for splitting and joining strings around tokens turn out to be especially useful when it comes to parsing text:

string.split

Splits a string into substrings, using either whitespace (tabs, spaces, newlines) or an explicitly passed string as a delimiter.

string.join

Concatenates a list or tuple of substrings, adding a space or an explicitly passed separator string between each.
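A minimal sketch of splitting and joining follows, assuming Python 3, where both operations are string methods rather than string-module functions. One difference worth noting under that assumption: the join method is called on the separator string, so the space that string.join added by default must be written out explicitly. The sample lines are made up for the example.

```python
# Assumes Python 3: split and join are string methods.
line = "aaa bbb\tccc\nddd"     # hypothetical whitespace-delimited text
record = "spam,ham,eggs"       # hypothetical comma-delimited text

fields = line.split()          # no argument: split on any run of whitespace
columns = record.split(",")    # explicit delimiter string

spaced = " ".join(fields)      # separator written out; a space here
dashed = "--".join(columns)    # any separator string works

print(fields, columns, spaced, dashed)
```

Splitting on a delimiter and then joining on the same delimiter gives back the original string, which is why the pair is handy for simple parsing.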

As we saw earlier in this book, split chops a string ...
