Apache Solr for Indexing Data by Anshul Johri, Sachin Handiekar

Configuring Solr with Nutch

Apache Solr can easily be configured for use with Nutch. We can perform the following steps to integrate Apache Nutch with Solr:

  1. Create a new core (nutch-example) in Solr by copying the nutch-example folder from the Chapter 7 code that comes with this book.
  2. After creating the new core, restart the Solr instance so that it loads the core.
  3. With Solr running again, let's crawl some data using Nutch and index it into Solr. To do this, we'll navigate to the %NUTCH_HOME% folder and run the crawl script:
    $ bin/crawl
    

    Because we ran the script without any arguments, it prints its usage information:

    Usage: crawl [-i|--index] [-D "key=value"] <Seed Dir> <Crawl Dir> <Num Rounds>
     -i|--index Indexes crawl results into a configured indexer ...
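
The usage message above suggests what a full invocation might look like. The following is a minimal sketch rather than the book's own command: the seed URL, the urls and crawl directory names, the number of rounds, and the Solr URL (built from the nutch-example core name and Solr's default port 8983) are all assumptions. Passing the Solr location through the solr.server.url property also assumes a Nutch 1.x release that configures its Solr indexer this way; other releases may expect it elsewhere.

```shell
#!/bin/sh
# Sketch only: every value below is an assumption, adjust to your setup.
SOLR_URL="http://localhost:8983/solr/nutch-example"  # assumed Solr host, port, and core
SEED_DIR="urls"     # hypothetical seed directory (<Seed Dir>)
CRAWL_DIR="crawl"   # hypothetical crawl data directory (<Crawl Dir>)
NUM_ROUNDS=2        # number of crawl rounds (<Num Rounds>)

# Nutch reads seed URLs from plain-text files in the seed directory, one URL per line.
mkdir -p "$SEED_DIR"
echo "http://nutch.apache.org/" > "$SEED_DIR/seed.txt"

# Assemble the command following the usage string:
#   crawl [-i|--index] [-D "key=value"] <Seed Dir> <Crawl Dir> <Num Rounds>
CRAWL_CMD="bin/crawl -i -D solr.server.url=$SOLR_URL $SEED_DIR $CRAWL_DIR $NUM_ROUNDS"
echo "$CRAWL_CMD"

# Run the crawl only when the script is actually present (that is, when we are in the Nutch home folder).
if [ -x bin/crawl ]; then
  $CRAWL_CMD
fi
```

The -i flag is what asks the script to index the crawl results; without it, the data is crawled but not sent to the configured indexer.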
