Apache Solr 3 Enterprise Search Server by Eric Pugh, David Smiley

The Clustering component

The Clustering component is a Solr contrib module that provides an extension point for integrating a clustering engine. Clustering is a technique that groups similar documents into clusters using statistical methods. Each cluster is identified by a few words that distinguish its documents from those in the other clusters. As with the MoreLikeThis component, which also uses statistical techniques, the quality of the results is hit or miss.
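To make the idea of cluster labels concrete, here is a minimal Python sketch of how a label might be chosen from words that set one cluster apart from the others. It is not the algorithm used by any clustering engine that Solr integrates with. The function name `label_clusters`, the scoring rule (occurrences inside the cluster minus occurrences elsewhere), and the sample documents are all assumptions made for illustration, and the documents are taken as already grouped.

```python
from collections import Counter


def label_clusters(clusters, top_n=2):
    """Pick up to top_n label words per cluster (hypothetical helper).

    A word scores higher the more often it occurs inside a cluster and
    the less often it occurs in the other clusters. This is a toy
    scoring rule, not a real clustering engine's labeling method.
    """
    # Word counts per cluster, using naive whitespace tokenization.
    counts = {
        name: Counter(word for doc in docs for word in doc.lower().split())
        for name, docs in clusters.items()
    }
    # Word counts across all clusters.
    total = Counter()
    for cluster_counts in counts.values():
        total.update(cluster_counts)

    labels = {}
    for name, cluster_counts in counts.items():
        # score = occurrences in this cluster minus occurrences elsewhere
        scored = {w: n - (total[w] - n) for w, n in cluster_counts.items()}
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scored, key=lambda w: (-scored[w], w))
        labels[name] = [w for w in ranked if scored[w] > 0][:top_n]
    return labels


# Sample data (invented): two pre-grouped sets of short documents.
clusters = {
    "a": ["solr search server", "solr search index"],
    "b": ["java search library", "java virtual machine"],
}
labels = label_clusters(clusters)
print(labels)
```

In this sample, "search" appears in both groups, so it scores lower than "solr" in the first group and is excluded from the second group's labels altogether. Words shared across clusters make poor labels, which is one reason label quality depends heavily on the data.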

Tip

Search and faceting should generally be the primary means of navigating and discovering your data. For so-called unstructured text use cases, there are, by definition, few attributes to facet on. Clustering search results and presenting tag clouds ...
