February 2018
Intermediate to advanced
262 pages
6h 59m
English
It is best practice to split the data into three parts—training, validation, and test datasets. The best approach for using the holdout dataset is to:
Avoid splitting the data into two parts, as it may lead to an information leak. Training and testing it on the same dataset is a clear no-no as it does not guarantee algorithm generalization. There are three popular holdout strategies that can be used to split the data into training ...