O'Reilly logo

Hadoop MapReduce v2 Cookbook - Second Edition by Thilina Gunarathne

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Elasticsearch for indexing and searching

Elasticsearch (http://www.elasticsearch.org/) is an Apache 2.0 licensed open source search solution built on top of Apache Lucene. Elasticsearch is a distributed, multi-tenant, and document-oriented search engine. Elasticsearch supports distributed deployments, by breaking down an index into shards and by distributing the shards across the nodes in the cluster. While both Elasticsearch and Apache Solr use Apache Lucene as the core search engine, Elasticsearch aims to provide a more scalable and a distributed solution that is better suited for the cloud environments than Apache Solr.

Getting ready

Install Apache Nutch and crawl some web pages as per the Intradomain web crawling using Apache Nutch or Whole ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required