February 2018
Intermediate to advanced
262 pages
6h 59m
English
Duplicates are common in data. Care should be taken so that the data present in the training, validation, and test sets are unique. If there are duplicates, then the model may not generalize well on unseen data.