Chapter 13

Clustering Words and Documents

Contents

Preamble

Clustering is arguably the oldest technology in text mining. Early applications of document clustering supported military document retrieval systems during World War II. Later, document search engines used clustering as a preprocessing technique. Whether it groups documents in a corpus or words in a document, clustering is almost always a means to an end rather than an end in itself. Modern Internet search engines rely on document clustering techniques to aid information retrieval. Therefore, we might consider clustering ...

Get Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications now with the O’Reilly learning platform.
