Analysis

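First, we rescale the data, because the two words occur on very different scales: the highest frequency of the word "money" in any book is 0.08%, whereas the highest frequency of the word "god(s)" is 1.72%. Left unscaled, the god(s) values would dominate any distance computed between two books. So, we divide each frequency of money by 0.08 and each frequency of god(s) by 1.72, which maps both features onto the range from 0 to 1, as follows: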

Book number   Money scaled   God(s) scaled
1             0              0.0406976744
2             0              0.0988372093
3             0.125          0.0581395349
4             0              0.1860465116
5             0              0.0348837209
6             0              0.1569767442
7             0              0.0348837209
8             0.25           0.3430232558
9             0.25           0.261627907
10            0.125          0.4011627907
11            0.125          1
12            0.625          0.0058139535
13            1              0
14            0.5            0.0058139535
15            0.375          0.0174418605
16            0.5            0.0174418605
17            0.75           0.0174418605
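
The division by the column maximum can be sketched in a few lines of Python. This is only an illustration, not the book's own code: the helper name rescale_by_max is hypothetical, and the three raw frequencies used here (books 3, 11, and 13) are back-computed from the scaled table by multiplying by 0.08 and 1.72, so treat them as assumed values.

```python
def rescale_by_max(values):
    """Divide every value by the largest one, so the maximum maps to 1."""
    peak = max(values)
    if peak == 0:
        # Nothing to scale; avoid dividing by zero.
        return [0.0 for _ in values]
    return [v / peak for v in values]

# Assumed raw frequencies in percent for books 3, 11 and 13,
# back-computed from the scaled table (not taken from the raw data).
books = [3, 11, 13]
money = [0.01, 0.01, 0.08]
gods = [0.10, 1.72, 0.0]

money_scaled = rescale_by_max(money)
gods_scaled = rescale_by_max(gods)

for book, m, g in zip(books, money_scaled, gods_scaled):
    print(book, round(m, 10), round(g, 10))
```

After this step, the book with the highest frequency of each word has the value 1 for that word, and every other book is expressed as a fraction of that maximum.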

Now that we have rescaled the data, let's apply the k-means clustering ...
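
As a rough sketch of what such a clustering step might look like, the following pure-Python version runs k-means on the scaled values from the table above. Several choices here are assumptions made for illustration and are not taken from the text: the number of clusters (k = 2), the use of books 1 and 17 as the initial centroids, and the function names closest and k_means.

```python
import math

# (money scaled, god(s) scaled) for books 1 to 17, copied from the table.
points = [
    (0, 0.0406976744), (0, 0.0988372093), (0.125, 0.0581395349),
    (0, 0.1860465116), (0, 0.0348837209), (0, 0.1569767442),
    (0, 0.0348837209), (0.25, 0.3430232558), (0.25, 0.261627907),
    (0.125, 0.4011627907), (0.125, 1), (0.625, 0.0058139535),
    (1, 0), (0.5, 0.0058139535), (0.375, 0.0174418605),
    (0.5, 0.0174418605), (0.75, 0.0174418605),
]

def closest(point, centroids):
    """Return the index of the centroid nearest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [closest(p, centroids) for p in points]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return labels, centroids

# Assumption: k = 2, seeded with book 1 and book 17.
labels, centroids = k_means(points, [points[0], points[-1]])

for book, label in enumerate(labels, start=1):
    print(book, label)
```

Because both features were rescaled to the same range, the Euclidean distance used in closest weighs the two words equally. A different k or different initial centroids may produce a different grouping.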
