Processing Data Files Sequentially
By now your lib/parse-rdf.js is a robust module that can reliably convert RDF content into JSON documents. All that remains is to walk through the Project Gutenberg catalog directory and collect all the JSON documents.
More concretely, we need to do the following:
- Traverse the data/cache/epub directory tree, looking for files ending in .rdf.
- Read each RDF file.
- Run the RDF content through parseRDF.
- Collect the JSON-serialized objects into a single bulk file for insertion.
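As a rough illustration of these steps, here is a minimal sketch that uses only Node.js's built-in fs and path modules. The names findFiles and processSequentially are hypothetical, not part of the project. The parser is passed in as a function so the sketch does not depend on the exact exports of lib/parse-rdf.js, and writing one JSON document per line is an assumption rather than the final bulk format.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Hypothetical helper: recursively collect the paths of all files under
// `dir` whose names end in `ext`, appending them to `found`.
function findFiles(dir, ext, found = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      findFiles(fullPath, ext, found);
    } else if (entry.isFile() && entry.name.endsWith(ext)) {
      found.push(fullPath);
    }
  }
  return found;
}

// Hypothetical helper: read each .rdf file one at a time, run its content
// through `parse`, and pass the JSON-serialized result to `emit`.
// Returns the number of files processed.
function processSequentially(dir, parse, emit) {
  const files = findFiles(dir, '.rdf').sort();
  for (const file of files) {
    const doc = parse(fs.readFileSync(file, 'utf8'));
    emit(JSON.stringify(doc));
  }
  return files.length;
}
```

Assuming lib/parse-rdf.js exports the parseRDF function directly, a call might look like processSequentially('data/cache/epub', parseRDF, line => console.log(line)), with standard output redirected to a file. The synchronous calls keep the sketch short and make the one-file-at-a-time ordering explicit.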
The NoSQL database we’ll be using is Elasticsearch, a document datastore that indexes JSON objects. Soon, in Chapter 6, Commanding Databases, we’ll dive deep into Elasticsearch and how to effectively use it with Node.js. You’ll learn how to ...