10

Clustering Speech-to-Text Transcriptions

When dealing with real-world datasets, the most common situation is that they come unlabeled—manually labeling each sample is often unrealistic in terms of time and cost. Therefore, it is imperative to incorporate methods that can handle datasets of this type. Unsupervised learning algorithms are applicable in this case and, in this chapter, we deal with a particular kind for grouping similar data under the same category. Expressly, we incorporate clustering methods that allow the transformation of raw data into possible actionable insights, for instance, identifying the general theme in each cluster.

While the previous chapters focused mainly on supervised learning techniques, we dedicate the current ...

Get Machine Learning Techniques for Text now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.