Time for action - handling compressed files

In this example, we'll add support for common compression formats using Python's standard library.

  1. Using the code in logscan-c.py as your starting point, create logscan-d.py. Add a new function just below the MaxSizeHandler class.
    def get_stream(path):
        """
        Detect compression.

        If the file name ends in a compression
        suffix, we'll open it using the correct
        algorithm. If not, we just return a standard
        file object.
        """
        _open = open
        if path.endswith('.gz'):
            _open = gzip.open
        elif path.endswith('.bz2'):
            # BZ2File is the bz2 module's file-like class.
            _open = bz2.BZ2File
        return _open(path)
  2. Within our main section, update the line that reads open(opts.file) to read get_stream(opts.file).
  3. At the top of the listing, ensure that you're importing the two new compression ...
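
To see the suffix-based dispatch from step 1 working in isolation, here is a minimal, self-contained sketch. It is not part of the logscan listing: the helper names read_lines and demo and the sample file names are hypothetical, and passing 'rb' to every opener is an assumption made here so that all three streams return the same kind of data.

```python
import bz2
import gzip
import os
import shutil
import tempfile


def get_stream(path):
    """Pick an opener from the file name's suffix and open the file."""
    _open = open
    if path.endswith('.gz'):
        _open = gzip.open
    elif path.endswith('.bz2'):
        _open = bz2.BZ2File
    # Assumption: binary mode, so plain, gzip, and bz2 streams all yield bytes.
    return _open(path, 'rb')


def read_lines(path):
    """Hypothetical helper: return every line of a (possibly compressed) file."""
    stream = get_stream(path)
    try:
        return stream.readlines()
    finally:
        stream.close()


def demo():
    """Write the same payload three ways, then read each copy back."""
    payload = b'line one\nline two\n'
    writers = {
        'sample.log': open,
        'sample.log.gz': gzip.open,
        'sample.log.bz2': bz2.BZ2File,
    }
    tmpdir = tempfile.mkdtemp()
    results = {}
    try:
        for name, opener in writers.items():
            path = os.path.join(tmpdir, name)
            out = opener(path, 'wb')
            try:
                out.write(payload)
            finally:
                out.close()
            results[name] = read_lines(path)
    finally:
        # Remove the temporary directory and the sample files in it.
        shutil.rmtree(tmpdir)
    return results


if __name__ == '__main__':
    for name, lines in sorted(demo().items()):
        print(name, lines)
```

Because the caller only ever sees a file-like object, the rest of the script can iterate over lines without knowing whether the log was compressed. Note that the choice is made from the file name alone, so a compressed file with an unexpected suffix would be read as plain data.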
