Python Programming On Win32 by Mark Hammond, Andy Robinson

Geometric Operations

Now that we have the data, what to do with it depends on the operation taking place. An approach that has stood the test of time is to keep adding operations to the Dataset class, building over time a veritable Swiss army knife. Common families of operations can include:

Field transformations

Applying functions to entire columns in order to format numbers and dates, switch encodings, or build database keys.

Row and column operations

Inserting, appending, and deleting whole columns, breaking into several separate datasets whenever a certain field changes, and sorting operations.

Filter operations

Extracting or dropping rows meeting user-defined criteria.

Geometric operations

Cross-tabulate, detabulate (see Figure 13.4), and transpose.

Storage operations

Load and save to native Python data (marshal, cPickle), delimited text files, and fixed-width text files.
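A few of these families can be sketched in a handful of lines. The class below is only an illustration, not the book's Dataset class: the method names `transform_field`, `filter_rows`, and `add_constant_column` are hypothetical, and the row-tuple storage is an assumption. Only `pp` is a name taken from the text.

```python
class Dataset:
    """Hypothetical minimal dataset: field names plus a list of row tuples."""

    def __init__(self, fieldnames, rows):
        self.fieldnames = list(fieldnames)
        self.rows = [tuple(row) for row in rows]

    def transform_field(self, name, func):
        """Field transformation: apply func to every value in one column."""
        i = self.fieldnames.index(name)
        self.rows = [row[:i] + (func(row[i]),) + row[i + 1:] for row in self.rows]

    def filter_rows(self, predicate):
        """Filter operation: new Dataset of the rows where predicate(row_dict) is true."""
        kept = [row for row in self.rows
                if predicate(dict(zip(self.fieldnames, row)))]
        return Dataset(self.fieldnames, kept)

    def add_constant_column(self, name, value):
        """Column operation: append a column holding the same value in every row."""
        self.fieldnames.append(name)
        self.rows = [row + (value,) for row in self.rows]

    def pp(self):
        """Pretty-print the header and then each row as a tuple."""
        print(tuple(self.fieldnames))
        for row in self.rows:
            print(row)


ds = Dataset(("Patient", "X"),
             [("Patient 1", 0.55), ("Patient 2", 0.54), ("Patient 3", 0.61)])
ds.transform_field("Patient", str.upper)        # format a whole column
high = ds.filter_rows(lambda r: r["X"] > 0.54)  # keep rows meeting a criterion
high.add_constant_column("Unit", "ratio")       # "Unit" is an invented column
high.pp()
```

Because every operation is a method on one class, new operations can be added over time without changing how callers hold their data.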

Some of these operations are best understood diagrammatically. Consider the operation in Figure 13.4, which is awkward to express in plain SQL.

Figure 13.4. Detabulating and adding constant columns

This operation was a mainstay of the case study that follows. Once the correct operations have been created, it can be reduced to a piece of Python code:

>>> ds1.pp()    # presume we have the table above already
('Patient', 'X', 'Y', 'Z')
('Patient 1', 0.55, 0.08, 0.97)
('Patient 2', 0.54, 0.11, 0.07)
('Patient 3', 0.61, 0.08, 0.44)
...
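Since the figure is not reproduced here, the following is one plausible reading of "detabulating and adding constant columns", not the book's implementation. It assumes that detabulating turns each non-key cell into its own row of (key, column name, value), and that constant columns are appended with the same value on every row. The function name `detabulate`, the output field names `Field` and `Value`, and the `Study` constant are all hypothetical.

```python
def detabulate(fieldnames, rows, key_count=1, constants=()):
    """Emit one output row per non-key cell: keys, column name, value, constants."""
    keys = tuple(fieldnames[:key_count])
    value_names = fieldnames[key_count:]
    const_names = tuple(name for name, _ in constants)
    const_values = tuple(value for _, value in constants)
    out_fields = keys + ("Field", "Value") + const_names
    out_rows = []
    for row in rows:
        # Pair each non-key column name with the cell beneath it.
        for name, value in zip(value_names, row[key_count:]):
            out_rows.append(tuple(row[:key_count]) + (name, value) + const_values)
    return out_fields, out_rows


fields = ("Patient", "X", "Y", "Z")
rows = [
    ("Patient 1", 0.55, 0.08, 0.97),
    ("Patient 2", 0.54, 0.11, 0.07),
    ("Patient 3", 0.61, 0.08, 0.44),
]
# "Study" / "Trial A" is an invented constant column for illustration.
out_fields, out_rows = detabulate(fields, rows, constants=[("Study", "Trial A")])
print(out_fields)
for r in out_rows:
    print(r)
```

Three input rows with three measurement columns become nine output rows, each carrying the patient, the name of the original column, the value, and the constant.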
