Because kNN is so simple, it makes few assumptions. However, there are some common pitfalls to be aware of when you apply it:
- kNN is evaluated lazily: the distances or similarities are calculated only when we need to make a prediction, so there is essentially nothing to train or fit beforehand. This has some advantages (there is no training cost), but every prediction requires a search over the stored points, which can be slow when you have many data points.
- The choice of k is up to you, but you should choose it systematically and be able to justify the value you pick. A common technique is to search over a range of k values. You could, for example, start with k = 2. Then, ...
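
To make both points concrete, here is a minimal sketch in plain Python. It is an illustration, not a reference implementation. The helper names (`knn_predict`, `accuracy_for_k`, `make_toy_data`), the synthetic two-blob dataset, the 70/30 train/validation split, and the upper bound of the k range are all assumptions made for this example. Note that there is no "fit" step: `knn_predict` computes every distance at prediction time, which is the lazy evaluation described above. The search then scores each candidate k on held-out points.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Lazy evaluation: every distance is computed here, at prediction time.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy_for_k(train_X, train_y, val_X, val_y, k):
    """Fraction of validation points classified correctly for a given k."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(val_X, val_y)
    )
    return correct / len(val_X)


def make_toy_data(n_per_class, rng):
    """Two Gaussian blobs in 2D labelled 0 and 1 (synthetic, for illustration)."""
    X, y = [], []
    for label, center in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
            y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_toy_data(100, rng)

# Shuffle the indices and hold out 30% of the points for validation.
indices = list(range(len(X)))
rng.shuffle(indices)
split = int(0.7 * len(indices))
train_idx, val_idx = indices[:split], indices[split:]
train_X = [X[i] for i in train_idx]
train_y = [y[i] for i in train_idx]
val_X = [X[i] for i in val_idx]
val_y = [y[i] for i in val_idx]

# Search over a range of k values, starting at k = 2 as in the text.
# The upper bound of 15 is an arbitrary choice for this sketch.
candidate_ks = range(2, 16)
scores = {
    k: accuracy_for_k(train_X, train_y, val_X, val_y, k) for k in candidate_ks
}
best_k = max(scores, key=scores.get)
print(f"best k = {best_k}, validation accuracy = {scores[best_k]:.3f}")
```

Recording the whole `scores` table, rather than only `best_k`, gives you the justification for the k you choose. With a single validation split the scores can be noisy, so averaging over several splits (cross-validation) is a common refinement.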