Solr 1.4 Enterprise Search Server

Book Description

Enhance your search with faceted navigation, result highlighting, fuzzy queries, ranked scoring, and more

  • Deploy, embed, and integrate Solr with a host of programming languages

  • Implement faceting in e-commerce and other sites to summarize and navigate the results of a text search

  • Enhance your search by highlighting search results, offering spell-corrections and auto-suggest, finding “similar” records, boosting records and fields for scoring, and matching phonetically

  • Informative and practical approach to development with fully working examples of integrating a variety of technologies

  • Written and tested for Solr 1.4 pre-release 2009.08

In Detail

If you are a developer building a high-traffic web site, you need a terrific search engine. Sites like Netflix.com and Zappos.com employ Solr, an open source enterprise search server that uses and extends the Lucene search library. This is the first book on the market about Solr, and it shows you how to equip your web site for high-volume traffic with full-text search capabilities and a wide range of customization options, so that your users get a terrific search experience.

This book is a comprehensive reference guide for every feature Solr has to offer. It serves the reader right from initiation to development to deployment. It also comes with complete running examples to demonstrate its use and show how to integrate it with other languages and frameworks.

This book first gives you a quick overview of Solr, and then gradually takes you from basic to advanced features that enhance your search. It starts off by discussing Solr and helping you understand how it fits into your architecture: where databases and document/web crawlers fall short, and where Solr shines. The main part of the book is a thorough exploration of nearly every feature that Solr offers. To keep this interesting and realistic, we use a large open source set of metadata about artists, releases, and tracks, courtesy of the MusicBrainz.org project. Using this data as a testing ground for Solr, you will learn how to import it in various ways, from CSV to XML to database access. You will then learn how to search this data in a myriad of ways, including using Solr's rich query syntax, "boosting" match scores based on record data and other means, searching across multiple fields with different boosts, getting facets on the results, auto-completing user queries, spell-correcting searches, and highlighting queried text in search results.

After this thorough tour, we'll demonstrate working examples of integrating a variety of technologies with Solr such as Java, JavaScript, Drupal, Ruby, XSLT, PHP, and Python.

Finally, we'll cover various deployment considerations, including indexing strategies and performance-oriented configuration, that will enable you to scale Solr to meet the needs of a high-volume site.

Table of Contents

  1. Copyright
  2. Credits
  3. About the Authors
  4. About the Reviewers
  5. Preface
    1. What this book covers
    2. Who this book is for
    3. Conventions
    4. Reader feedback
    5. Customer support
  6. Quick Starting Solr
    1. An introduction to Solr
    2. Comparison to database technology
    3. Getting started
    4. A quick tour of Solr!
    5. The schema and configuration files
    6. Solr resources outside this book
    7. Summary
  7. Schema and Text Analysis
    1. MusicBrainz.org
    2. One combined index or multiple indices
    3. Schema design
    4. The schema.xml file
    5. Text analysis
    6. Summary
  8. Indexing Data
    1. Communicating with Solr
    2. Using curl to interact with Solr
    3. Remote streaming
    4. Sending XML to Solr
    5. Sending CSV to Solr
    6. Direct database and XML import
    7. Indexing documents with Solr Cell
    8. Summary
  9. Basic Searching
    1. Your first search, a walk-through
    2. Solr's generic XML structured data representation
    3. Solr's XML response format
    4. Query parameters
    5. Query syntax
    6. Filtering
    7. Sorting
    8. Request handlers
    9. Scoring
    10. Summary
  10. Enhanced Searching
    1. Function queries
    2. Dismax Solr request handler
    3. Faceting
    4. Summary
  11. Search Components
    1. About components
    2. The highlighting component
    3. Query elevation
    4. Spell checking
    5. The more-like-this search component
    6. Stats component
    7. Field collapsing
    8. Other components
    9. Summary
  12. Deployment
    1. Implementation methodology
    2. Installing into a Servlet container
    3. Logging
    4. A SearchHandler per search interface
    5. Solr cores
    6. JMX
    7. Securing Solr
    8. Summary
  13. Integrating Solr
    1. Structure of included examples
    2. SolrJ: Simple Java interface
    3. Using JavaScript to integrate Solr
    4. Accessing Solr from PHP applications
    5. Ruby on Rails integrations
    6. Summary
  14. Scaling Solr
    1. Tuning complex systems
    2. Optimizing a single Solr server (Scale High)
    3. Moving to multiple Solr servers (Scale Wide)
    4. Combining replication and sharding (Scale Deep)
    5. Summary
  15. Index