Clustering the Headlines dataset

Let's now look at clustering the Headlines dataset:

  1. We will import the required functions:

  2. We also want to construct silhouette plots, so we need to compute the Jaccard similarity. For that, we will use the following lines of code:

This results in the following output:

  3. Now, let's go ahead and perform this clustering, using the following function:
  4. We will then print out the clusters, using the following ...
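
The code listings for these steps are not reproduced in this excerpt. As a rough illustration of the ideas involved, the following standard-library sketch computes pairwise Jaccard distances (one minus the Jaccard similarity) between headlines treated as sets of words, derives per-headline silhouette values from those distances, and prints the headlines grouped by cluster. The headlines, the cluster labels, and every function name here are hypothetical placeholders chosen for illustration; this is not the book's code, and the labels are simply assumed rather than produced by a clustering algorithm.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(token_sets):
    """Full pairwise Jaccard distance matrix as a list of lists."""
    n = len(token_sets)
    return [[jaccard_distance(token_sets[i], token_sets[j]) for j in range(n)]
            for i in range(n)]


def silhouette_values(dist, labels):
    """Per-sample silhouette (b - a) / max(a, b) from precomputed distances.

    Assumes at least two distinct cluster labels.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values


# Hypothetical headlines and cluster labels, for illustration only.
headlines = [
    "stocks rally as markets rise",
    "markets rise as stocks rally again",
    "team wins championship final",
    "team wins final in overtime",
]
labels = [0, 0, 1, 1]

token_sets = [set(h.split()) for h in headlines]
dist = distance_matrix(token_sets)
sil = silhouette_values(dist, labels)

# Group the headlines by cluster label and print each cluster.
clusters_by_label = {}
for headline, lab in zip(headlines, labels):
    clusters_by_label.setdefault(lab, []).append(headline)

for lab, members in sorted(clusters_by_label.items()):
    print(f"Cluster {lab}:")
    for headline in members:
        print(f"  {headline}")
```

A silhouette plot is then a sorted bar chart of the values in `sil`, grouped by cluster. For a real dataset, library routines such as SciPy's `pdist` with `metric='jaccard'` and scikit-learn's `silhouette_samples` with `metric='precomputed'` would likely be preferable to these hand-written loops.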
