Chapter 11. Text Processing

There is a whole range of applications for which scripting languages like Python are perfectly suited. In fact, scripting languages were arguably invented specifically for these applications, which involve searching and processing files in the directory tree. Taken together, these applications are often called text processing. Python is a great scripting tool both for writing quick text processing scripts and for scaling them up into more generally useful code later, using its clean object-oriented coding style.

In this chapter you learn:

  • Some of the typical reasons you need text processing scripts

  • A few simple scripts for quick system administration tasks

  • How to navigate around in the directory structure in a platform-independent way, so your scripts will work fine on Linux, Windows, or even the Mac

  • How to create regular expressions to match against the files found by the os and os.path modules

  • How to use successive refinement to keep enhancing your Python scripts to winnow through the data found
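
As a rough preview of how the directory navigation and regular expression topics listed above fit together, here is a minimal sketch. It assumes you want to collect every file below a starting directory whose name matches a pattern. The function name find_matching_files and the ".log" pattern are hypothetical, chosen only for illustration; os.walk, os.path.join, and re.compile are standard library calls.

```python
import os
import re


def find_matching_files(root, pattern):
    """Return a sorted list of paths under root whose file name matches pattern."""
    name_re = re.compile(pattern)
    matches = []
    # os.walk visits root and every directory below it, yielding
    # (directory path, subdirectory names, file names) for each one.
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            if name_re.search(filename):
                # os.path.join inserts the separator appropriate to the
                # platform, so no "/" or "\\" is hard-coded here.
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical usage: list files ending in ".log" below the current directory.
    for path in find_matching_files(os.curdir, r"\.log$"):
        print(path)
```

Because the path handling goes through os.path rather than string concatenation, the same sketch should behave the same way on Linux, Windows, or the Mac. Tightening the pattern, or adding a second pass that reads each matching file, is one way to apply successive refinement to it.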

Text processing scripts are among the most useful tools in the toolbox of anybody who seriously works with computer systems, and Python is a great way to do text processing. You're going to like this chapter.

Why Text Processing Is So Useful

In general, the whole idea behind text processing is simply finding things. There are, of course, situations in which data are organized in a structured way; these are called databases and that's not what this chapter ...

Get Beginning Python®: Using Python 2.6 and Python 3.1 now with the O’Reilly learning platform.