Python Cookbook

by Alex Martelli, David Ascher
July 2002
Intermediate to advanced
608 pages
15h 46m
English
O'Reilly Media, Inc.
Accessing Substrings

Credit: Alex Martelli

Problem

You want to access portions of a string. For example, you’ve read a fixed-width record and want to extract the record’s fields.

Solution

Slicing works well, but each slice extracts only one field:

afield = theline[3:8]
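
As a quick illustration of the slice above, here is a minimal sketch. The sample record in `theline` is invented for this example; only the slice `[3:8]` comes from the recipe. Slice indices are zero-based and the end index is excluded.

```python
# Hypothetical fixed-width record, made up for illustration.
theline = "ID:12345 Alice   Smith   extra"

# Characters at indices 3 through 7 (index 8 is excluded).
afield = theline[3:8]
print(afield)  # 12345

# A slice that runs past the end does not raise an error;
# it simply returns whatever characters exist.
short = theline[25:100]
print(short)   # extra
```

Because out-of-range slices return a shorter string rather than failing, a truncated record yields short or empty fields instead of an exception, so length checks are worth doing separately.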

If you need to think in terms of field length, struct.unpack may be appropriate. Here’s an example of getting a five-byte string, skipping three bytes, getting two eight-byte strings, and then getting the rest:

import struct

# Get a 5-byte string, skip 3, get two 8-byte strings, then all the rest:
baseformat = "5s 3x 8s 8s"
numremain = len(theline)-struct.calcsize(baseformat)
format = "%s %ds" % (baseformat, numremain)
leading, s1, s2, trailing = struct.unpack(format, theline)
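
The same steps can be tried end to end with a sample record. This is a sketch under two assumptions: the record contents are invented, and the code targets Python 3, where `struct.unpack` expects a bytes-like object rather than a text string. The variable is named `fmt` here only to avoid shadowing the built-in `format`.

```python
import struct

# Hypothetical 32-byte record (Python 3: struct works on bytes, not str).
theline = b"HELLO---AAAAAAAABBBBBBBBthe rest"

# 5-byte string, skip 3 pad bytes, two 8-byte strings.
baseformat = "5s 3x 8s 8s"

# Whatever is left after the fixed fields becomes one final string field.
numremain = len(theline) - struct.calcsize(baseformat)
fmt = "%s %ds" % (baseformat, numremain)

leading, s1, s2, trailing = struct.unpack(fmt, theline)
print(leading, s1, s2, trailing)
```

Note that `struct.unpack` requires the data length to match the format exactly, which is why the remainder is computed first. A record shorter than the fixed fields would make `numremain` negative and the resulting format invalid.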

If you need to split at five-byte boundaries, here’s how you could do it:

numfives, therest = divmod(len(theline), 5)
form5 = "%s %dx" % ("5s "*numfives, therest)
fivers = struct.unpack(form5, theline)
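
Here is a sketch of the five-byte split on an invented 13-byte sample, again assuming Python 3 bytes input. It also shows a plain slicing alternative, which is not part of the recipe, for cases where the trailing partial chunk should be kept rather than skipped.

```python
import struct

# Hypothetical sample: two full 5-byte chunks plus 3 leftover bytes.
theline = b"abcdefghijklm"

numfives, therest = divmod(len(theline), 5)
# One "5s" per full chunk, then pad bytes ("x") to skip the remainder.
form5 = "%s %dx" % ("5s " * numfives, therest)
fivers = struct.unpack(form5, theline)
print(fivers)  # (b'abcde', b'fghij')

# Alternative with slicing: the last piece may be shorter than 5 bytes.
chunks = [theline[i:i + 5] for i in range(0, len(theline), 5)]
print(chunks)  # [b'abcde', b'fghij', b'klm']
```

The `struct` version silently drops the leftover bytes, while the slicing version returns them as a short final chunk. Which behavior is right depends on the record format.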

Chopping a string into individual characters is of course easier:

chars = list(theline)
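
One point to watch, assuming Python 3: `list` behaves differently on text and on bytes. The short sketch below uses made-up sample values to show both cases.

```python
# Text string: list() yields one-character strings.
text_line = "abc"
chars = list(text_line)
print(chars)       # ['a', 'b', 'c']

# Bytes: list() yields integers, not one-byte strings.
byte_line = b"abc"
codes = list(byte_line)
print(codes)       # [97, 98, 99]

# One-byte slices keep each element as a bytes object.
byte_chars = [byte_line[i:i + 1] for i in range(len(byte_line))]
print(byte_chars)  # [b'a', b'b', b'c']
```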

If you prefer to think of your data as being cut up at specific columns, slicing within list comprehensions may be handier:

cuts = [8,14,20,26,30]
# None as the final end index means "through the end of the string"
pieces = [ theline[i:j] for i, j in zip([0]+cuts, cuts+[None]) ]
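
The column-cutting idea can be wrapped in a small helper. This is a sketch, not part of the recipe: the function name `split_at` and the sample line are hypothetical, and `None` is used as the open-ended final slice bound.

```python
def split_at(theline, cuts):
    """Return the pieces of theline cut at the given column positions."""
    starts = [0] + cuts      # each piece begins where the previous one ended
    ends = cuts + [None]     # None as the last bound means "to the end"
    return [theline[i:j] for i, j in zip(starts, ends)]

# Hypothetical 36-character sample line.
theline = "abcdefghijklmnopqrstuvwxyz0123456789"
pieces = split_at(theline, [8, 14, 20, 26, 30])
print(pieces)
# ['abcdefgh', 'ijklmn', 'opqrst', 'uvwxyz', '0123', '456789']
```

A list of n cut positions always produces n + 1 pieces. If the line is shorter than some of the cuts, the later pieces come back as empty strings rather than raising an error.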

Discussion

This recipe was inspired by Recipe 1.1 in the Perl Cookbook. Python’s slicing takes the place of Perl’s substr. Perl’s built-in unpack and Python’s struct.unpack are similar. Perl’s is slightly handier, as it accepts a field length of ...


ISBN: 0596001673