First, we rescale the data, because the two features have very different ranges: the highest frequency of the word "money" is 0.08%, whereas the highest frequency of the word "god(s)" is 1.72%. We therefore divide each frequency of "money" by 0.08 and each frequency of "god(s)" by 1.72, as follows:
Book number | Money scaled | God(s) scaled |
1 | 0 | 0.0406976744 |
2 | 0 | 0.0988372093 |
3 | 0.125 | 0.0581395349 |
4 | 0 | 0.1860465116 |
5 | 0 | 0.0348837209 |
6 | 0 | 0.1569767442 |
7 | 0 | 0.0348837209 |
8 | 0.25 | 0.3430232558 |
9 | 0.25 | 0.261627907 |
10 | 0.125 | 0.4011627907 |
11 | 0.125 | 1 |
12 | 0.625 | 0.0058139535 |
13 | 1 | 0 |
14 | 0.5 | 0.0058139535 |
15 | 0.375 | 0.0174418605 |
16 | 0.5 | 0.0174418605 |
17 | 0.75 | 0.0174418605 |
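The rescaling in the table can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the helper name `rescale_by_max` is hypothetical, and the sample percentages are back-computed from a few table rows (scaled value times 0.08 or 1.72) rather than taken from the raw data. It assumes the frequencies are non-negative.

```python
def rescale_by_max(values):
    """Divide every value by the largest one so that results lie in [0, 1].

    Assumes all values are non-negative (true for word frequencies).
    """
    highest = max(values)
    if highest == 0:
        # An all-zero column has nothing to scale; avoid dividing by zero.
        return [0.0 for _ in values]
    return [v / highest for v in values]


# Frequencies in percent, back-computed from the table (an assumption):
# "money" for books 3, 8 and 13; "god(s)" for books 1, 2 and 11.
money_pct = [0.01, 0.02, 0.08]
gods_pct = [0.07, 0.17, 1.72]

money_scaled = rescale_by_max(money_pct)
gods_scaled = rescale_by_max(gods_pct)

for raw, scaled in zip(money_pct + gods_pct, money_scaled + gods_scaled):
    print(f"{raw:5.2f}% -> {scaled:.10f}")
```

After this step the largest value of each feature is 1, so neither word dominates the distance computation merely because it occurs more often.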
Now that we have rescaled the data, let's apply the k-means clustering ...