David C. Anastasiu
University of Minnesota, Minneapolis, MN, anast021@umn.edu
University of Calabria, Arcavacata di Rende, Italy, tagarelli@deis.unical.it
University of Minnesota, Minneapolis, MN, karypis@cs.umn.edu
The proliferation of documents, both on the Web and in private systems, makes knowledge discovery in document collections arduous. Clustering has long been recognized as a useful tool for this task. It groups like items together, maximizing intra-cluster similarity and inter-cluster distance. Clustering can provide insight into the make-up of a document collection and is often used as the initial step in data analysis.
While most document clustering ...