Hibernate Search by Example by Steve Perkins

Analysis

When a field is indexed by Lucene, it undergoes a parsing and conversion process called analysis. In Chapter 3, Performing Queries, we mentioned that the default analyzer tokenizes string fields, and that this behavior should be disabled if you plan to sort on that field.

However, much more is possible during analysis. Apache Solr components can be assembled in hundreds of combinations to manipulate text in various ways during indexing, which opens the door to some really powerful search functionality.

In order to discuss the Solr components that are available, or how to assemble them into a custom analyzer definition, we must first understand the three phases of Lucene analysis:

  • Character filtering
  • Tokenization
  • Token filtering
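
To make the three phases concrete, here is a minimal plain-Java sketch of the pipeline. It is an illustration only and does not use the Lucene or Solr API: the class name AnalysisSketch, its method names, the "&" replacement rule, and the stop-word list are all assumptions invented for this example. It simply shows the order in which the phases run and what kind of work each one does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration of the three analysis phases.
// This is NOT the Lucene/Solr API; every name here is hypothetical.
final class AnalysisSketch {

    // Hypothetical stop-word list chosen for this example.
    private static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "of");

    // Phase 1, character filtering: rewrite the raw text before it is split.
    // Here the "&" character is replaced with the word "and".
    static String filterCharacters(String raw) {
        return raw.replace("&", " and ");
    }

    // Phase 2, tokenization: split the filtered text into individual tokens
    // on any run of characters that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Phase 3, token filtering: modify or drop individual tokens.
    // Here each token is lowercased and stop words are removed.
    static List<String> filterTokens(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                result.add(lower);
            }
        }
        return result;
    }

    // Runs the three phases in order.
    static List<String> analyze(String raw) {
        return filterTokens(tokenize(filterCharacters(raw)));
    }

    public static void main(String[] args) {
        // prints [art, search, and, indexing]
        System.out.println(analyze("The Art of Search & Indexing"));
    }
}
```

Note that the character filter sees the whole string, while the token filter sees one token at a time. That is why the "&" substitution belongs in the first phase and the lowercasing and stop-word removal belong in the third.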

Analysis ...
