Everything should be made as simple as possible, but not simpler.
—Albert Einstein
The main purpose of machine learning is to learn the real model of the data from the training set, so that it can perform well on the unseen test set. We call this the generalization ability. Generally speaking, the training set and the test set are sampled from the same data distribution. The sampled samples are independent of each other, but come from the same distribution. ...