Just as in the first chapter, we have to scale the data. Annual incomes are numerically far larger than ages, so without scaling the income axis would dominate and diminish the impact of the age axis, which actually has good predictive power in this kind of problem. Older people are expected to have had more time to settle down, save money, and buy a house than younger people.
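As a minimal sketch of this kind of rescaling, the snippet below maps each value onto the interval [0, 1] with min-max scaling. The function name `min_max_scale` is hypothetical, and the bounds (ages 20 to 52, incomes 30,000 to 130,000 USD) are an assumption: they are inferred from the scaled values in this section's data, not stated explicitly in the text.

```python
def min_max_scale(value, lo, hi):
    """Map value from the range [lo, hi] onto [0, 1]."""
    return (value - lo) / (hi - lo)


# Assumed bounds, inferred from the scaled values in the data
# (for example, age 52 maps to 1 and income 30000 maps to 0).
AGE_MIN, AGE_MAX = 20, 52
INCOME_MIN, INCOME_MAX = 30_000, 130_000

# (age, annual income in USD) pairs taken from the data in this section.
rows = [(23, 50_000), (37, 34_000), (48, 40_000), (52, 30_000), (28, 95_000)]

# Scale both features so that neither axis dominates the other.
scaled = [
    (
        min_max_scale(age, AGE_MIN, AGE_MAX),
        min_max_scale(income, INCOME_MIN, INCOME_MAX),
    )
    for age, income in rows
]

for (age, income), (scaled_age, scaled_income) in zip(rows, scaled):
    print(age, scaled_age, income, scaled_income)
```

After scaling, both features lie in the same range, so a difference of a few thousand dollars no longer outweighs a difference of several decades in age.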
We apply the same rescaling from Chapter 1, Classification Using K Nearest Neighbors, and obtain the following table:
Age | Scaled age | Annual income in USD | Scaled annual income | House ownership status |
23 | 0.09375 | 50000 | 0.2 | non-owner |
37 | 0.53125 | 34000 | 0.04 | non-owner |
48 | 0.875 | 40000 | 0.1 | owner |
52 | 1 | 30000 | 0 | non-owner |
28 | 0.25 | 95000 | 0.65 | owner |
25 | 0.15625 | 78000 ... |