
Clean Up U.S. Addresses #81
Chapter 7, Names and Places
Clean Up U.S. Addresses Hack #81
As all election officials know, identifying the same address when it is
formatted differently can be a tricky problem.
Many of the hacks in this book are meant to help us work with large
volumes of geographical information. Most of the data sets described in this
book are systematically gathered and organized by mapping, surveying, and
databasing professionals. This results in well-defined and well-formatted
data, ideal for programmatic processing (and hacking). However, other
interesting data sets are populated by humans running around in the world
at large who love to make typos, misspell words, enter data in the wrong
fields, and mangle information in every (un)imaginable way.
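To make the problem concrete, here is a minimal Python sketch of how several
spellings of one address might be reduced to a common form before comparison.
It is an illustration only, not the method this hack goes on to use: the
function name normalize_address, the ABBREVIATIONS table, and the sample
addresses are all hypothetical, and the table covers only a handful of
street suffixes and directionals.

```python
import re

# Hypothetical abbreviation table: a tiny sample, not a complete list of
# street-suffix and directional spellings.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "boulevard": "blvd",
    "road": "rd",
    "north": "n",
    "south": "s",
    "east": "e",
    "west": "w",
}


def normalize_address(raw: str) -> str:
    """Reduce a street address to a crude canonical form for comparison."""
    # Lowercase, then turn punctuation into spaces so "St." and "ST" match.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", raw.lower())
    # Splitting on whitespace also collapses repeated spaces.
    words = cleaned.split()
    # Map long forms onto one abbreviation; leave unknown words alone.
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


# Made-up variants of a single address, as different people might type it.
variants = [
    "123 North Main Street",
    "123 N. Main St.",
    "123 n main   ST",
]
normalized = {normalize_address(v) for v in variants}
print(normalized)
```

A lookup table like this only handles differences in case, punctuation,
spacing, and abbreviation. It does nothing for typos such as "Mian Street"
or for values entered in the wrong field, which is why messy, human-entered
data usually needs a more thorough address parser.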
“Who Are the Neighbors Voting For?” [Hack #16] was powered by just such a
messy database that is distributed, conveniently, by the United States
Federal Election Commission. The FEC provides software for campaigns and
political committees to file their contribution records electronically, but this
software is fed by people contributing over the Web or by staffers entering
data from contributions collected at fundraising events or via paper mail.
These filings are archived and posted by the FEC on the Web at
http://herndon2.sdrdc.com/dcdev/. They contain the amount and date of each
contribution, as well as the name, ...