Credit: Paul Moore, Raymond Hettinger
You have a list of data grouped by a key value, typically read from a spreadsheet or the like, and want to generate a summary of that information for reporting purposes.
The itertools.groupby function introduced in Python 2.4 helps with this task:
from itertools import groupby
from operator import itemgetter

def summary(data, key=itemgetter(0), field=itemgetter(1)):
    """ Summarise the given data (a sequence of rows), grouped by the
        given key (default: the first item of each row), giving totals
        of the given field (default: the second item of each row).
        The key and field arguments should be functions which, given a
        data record, return the relevant value.
    """
    for k, group in groupby(data, key):
        yield k, sum(field(row) for row in group)

if __name__ == "__main__":
    # Example: given a sequence of sales data for city within region,
    # _sorted on region_, produce a sales report by region
    sales = [('Scotland', 'Edinburgh', 20000),
             ('Scotland', 'Glasgow', 12500),
             ('Wales', 'Cardiff', 29700),
             ('Wales', 'Bangor', 12800),
             ('England', 'London', 90000),
             ('England', 'Manchester', 45600),
             ('England', 'Liverpool', 29700)]
    for region, total in summary(sales, field=itemgetter(2)):
        # Parenthesised form runs under both Python 2 and Python 3
        print("%10s: %d" % (region, total))
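The example's comment notes that the sales data must be sorted on region. This matters because groupby only merges consecutive rows that share a key, so unsorted input produces several partial groups for the same key. As a minimal sketch, assuming the data arrives unsorted and fits in memory, the rows can be sorted on the key before grouping. The name sorted_summary and the sample rows are hypothetical and not part of the recipe:

```python
from itertools import groupby
from operator import itemgetter

def sorted_summary(data, key=itemgetter(0), field=itemgetter(1)):
    # Hypothetical variant of summary(): sort on the key first so that
    # rows with equal keys are adjacent before groupby sees them.
    for k, group in groupby(sorted(data, key=key), key):
        yield k, sum(field(row) for row in group)

# Illustrative rows, deliberately not sorted on region
unsorted_sales = [('Wales', 'Cardiff', 29700),
                  ('Scotland', 'Edinburgh', 20000),
                  ('Wales', 'Bangor', 12800),
                  ('Scotland', 'Glasgow', 12500)]

totals = list(sorted_summary(unsorted_sales, field=itemgetter(2)))
```

Sorting changes the order of the report to key order and costs an extra pass over the data, so it is only worth doing when the input is not already grouped.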
In many situations, data is available in tabular form, with the information naturally grouped by a subset of the data values (e.g., recordsets obtained from database queries ...