Chapter 6. Clustering text

In this chapter

  • Basic concepts behind common text clustering algorithms
  • Examples of how clustering can help improve text applications
  • How to cluster words to identify topics of interest
  • Clustering whole document collections using Apache Mahout and clustering search results using Carrot2

How often have you browsed content online and clicked through to an article with an interesting title, only to find that the underlying story was basically the same as the one you just finished? Or perhaps you’re tasked with briefing your boss on the day’s news but don’t have time to wade through all the content when all you need is a summary and a few key points. Alternatively, maybe your users routinely enter ambiguous ...
