Solr in Action

by Trey Grainger, Timothy Potter
March 2014
Content level: Intermediate to advanced
664 pages
21h 15m
English
Manning Publications

Appendix C. Useful data import configurations

As discussed in chapter 12, the Data Import Handler (DIH) lets Solr pull in datasets from many kinds of external sources. In chapter 10, we used the DIH to transform Wikipedia pages from a partial Wikipedia data dump file into Solr documents and index them. This appendix provides more detail on how the DIH was configured to enable this import, and we’ll demonstrate how to import both the full Wikipedia dataset and another large dataset useful for experimentation: a data dump from Stack Exchange.

C.1. Indexing Wikipedia

In chapter 10, we imported a subset of articles from Wikipedia into a preconfigured Solr core named solrpedia. To enable the DIH, several steps ...



Publisher Resources

ISBN: 9781617291029