Apache Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene. Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called "indexing") via JSON, XML, CSV or binary over HTTP. You query it via HTTP GET and receive JSON, XML, CSV or binary results.
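As a rough illustration of the HTTP interaction described above, the following Python sketch builds an indexing request (JSON over HTTP POST) and a query URL (HTTP GET returning JSON) using only the standard library. The base URL, the collection name `datalake`, and the helper function names are assumptions made for this example; the `/update` and `/select` handlers are Solr's usual endpoints, but check them against your own installation.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SOLR_BASE = "http://localhost:8983/solr"  # assumption: Solr on its default local port
COLLECTION = "datalake"                   # hypothetical collection name


def build_index_request(docs):
    """Build an HTTP POST that sends a list of documents to Solr as a JSON array."""
    url = f"{SOLR_BASE}/{COLLECTION}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def build_query_url(query, rows=10):
    """Build an HTTP GET URL that asks Solr for JSON-formatted results."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/{COLLECTION}/select?{params}"


def index_documents(docs):
    """Send the documents to Solr (requires a running server)."""
    with urlopen(build_index_request(docs)) as resp:
        return json.load(resp)


def search(query, rows=10):
    """Run a query against Solr and return the parsed JSON response."""
    with urlopen(build_query_url(query, rows)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL that would be requested; no network call is made here.
    print(build_query_url("title:hello", rows=5))
```

With a Solr instance running and the collection created, `index_documents([{"id": "1", "title": "hello"}])` followed by `search("title:hello")` would exercise both directions of the API. The request-building helpers are kept separate from the network calls so they can be inspected without a server.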
Some of the features listed at http://lucene.apache.org/solr/features.html are given here. These make Solr an ideal choice for the capability that we are looking for in our Data Lake implementation:
- Advanced and optimized full-text search: Powered by Lucene's advanced matching and searching capability
- Capable of handling high-volume traffic
- Standards-based open interfaces: XML, JSON and HTTP: ...