Importing Real Data

Next we’ll use the request module to import a large batch of data into the books database. As we do so, we’ll run into the limited-resource problem we first saw in The Limited-Resource Problem, which gives us a backdrop for experimenting with asynchronous coding techniques.

First we need to get some data to work with. We’ll use the catalog data from Project Gutenberg, a site dedicated to making public-domain works available as free ebooks.[20]

Downloading Project Gutenberg Data

Project Gutenberg produces catalog download bundles that contain a Resource Description Framework (RDF) file for each of its 43,000-plus books. (RDF is an XML-based format.) The bz2 version of the catalog file is about 13 MB. Fully extracted, it ...

