Cross-validation is a reliable way to detect a poorly chosen test set. In particular, we're going to use the K-Fold cross-validation approach. The idea is to split the whole dataset X into k folds of approximately equal size; at each iteration, one fold serves as the test set and the remaining k - 1 folds form the training set. The test set therefore contains about 1/k of the samples and, over the k iterations, it moves across the whole original dataset, so that every sample is used for testing exactly once.
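To make the moving test set concrete, here is a minimal sketch in plain Python. The helper name `k_fold_indices` and its parameters are hypothetical and introduced only for illustration; the sketch assumes contiguous, unshuffled folds, with any remainder spread over the first folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds.

    Hypothetical helper for illustration only.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # When n_samples is not divisible by k, the first (n_samples % k)
    # folds receive one extra sample each.
    base, extra = divmod(n_samples, k)

    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        # The test set is the current fold; the training set is everything else.
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


# Example: 10 samples split into 5 folds of 2 samples each.
folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("test:", test_idx, "train:", train_idx)
```

In practice, a library implementation such as scikit-learn's `KFold` class (in `sklearn.model_selection`) would normally be preferred, since it also supports shuffling the samples before splitting.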
In the following diagram, we see a schematic representation of the process:
In this way, it's possible to assess the accuracy of the model using different sampling splits, ...