March 2014
664 pages
English
As discussed in chapter 12, the Data Import Handler (DIH) enables Solr to pull in datasets from many kinds of external sources. In chapter 10, we used the DIH to transform Wikipedia pages from a partial Wikipedia data dump file into Solr documents and index them. This appendix provides more detail on how the DIH was configured to enable this import, and demonstrates how to import both the full Wikipedia dataset and another large dataset useful for experimentation: a data dump from Stack Exchange.
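To make the page-to-document transformation concrete, here is a minimal Python sketch of the idea. It is not the DIH itself, which is driven by XML configuration inside Solr; it only illustrates streaming over `<page>` elements and flattening each one into a document. The sample XML, the `pages_to_docs` helper, and the field names (`id`, `title`, `timestamp`, `text`) are assumptions made for illustration, and real Wikipedia dumps are far larger and declare an XML namespace that this sketch omits.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a Wikipedia dump (assumed shape; real dumps declare a namespace).
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Apache Solr</title>
    <id>1</id>
    <revision>
      <timestamp>2013-01-02T00:00:00Z</timestamp>
      <text>Solr is a search platform.</text>
    </revision>
  </page>
  <page>
    <title>Apache Lucene</title>
    <id>2</id>
    <revision>
      <timestamp>2013-01-03T00:00:00Z</timestamp>
      <text>Lucene is a search library.</text>
    </revision>
  </page>
</mediawiki>"""


def pages_to_docs(stream):
    """Yield one flat dict per <page>, parsing incrementally instead of loading the whole file."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "page":
            continue  # child elements finish first; wait for the enclosing page
        yield {
            "id": elem.findtext("id"),  # direct child only, not the revision's id
            "title": elem.findtext("title"),
            "timestamp": elem.findtext("revision/timestamp"),
            "text": elem.findtext("revision/text"),
        }
        elem.clear()  # drop the consumed subtree to keep memory use low


docs = list(pages_to_docs(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
for doc in docs:
    print(doc["id"], doc["title"])
```

Streaming one page at a time matters for a dataset of this size, because a full dump is too large to hold in memory as a single parsed tree.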
In chapter 12, we imported a subset of articles from Wikipedia into a preconfigured Solr core named solrpedia. In order to enable the DIH, several steps ...