Chapter 6. Clustering text

In this chapter

  • Basic concepts behind common text clustering algorithms
  • Examples of how clustering can help improve text applications
  • How to cluster words to identify topics of interest
  • Clustering whole document collections using Apache Mahout and clustering search results using Carrot2

How often have you browsed through content online and clicked through to an article with an interesting title, only to find that the underlying story was basically the same as the one you just finished? Or perhaps you’re tasked with briefing your boss on the day’s news but don’t have time to wade through all the content when all you need is a summary and a few key points. Alternatively, maybe your users routinely enter ambiguous ...
