Since the concept of the dataset is essential in ML, let's look at it in detail, with a focus on how to create the required splits for building a complete and correct ML pipeline.
A dataset is nothing more than a collection of data. Formally, we can describe a dataset as a set of pairs, $(e_i, l_i)$, where $e_i$ is the i-th example and $l_i$ is its label, with a finite cardinality, $k$:

$$\text{Dataset} = \{(e_i, l_i)\}_{i=1}^{k}$$
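To make the definition concrete, here is a minimal Python sketch that represents a dataset as a list of (example, label) pairs. The feature values, the integer labels, and the names `Example`, `Label`, and `dataset` are illustrative assumptions, not something the text prescribes.

```python
from typing import List, Tuple

# Hypothetical types: an example is a feature vector, a label is an integer class.
Example = List[float]
Label = int

# A toy dataset: a finite collection of (example, label) pairs.
# The values below are made up purely for illustration.
dataset: List[Tuple[Example, Label]] = [
    ([5.1, 3.5], 0),
    ([4.9, 3.0], 0),
    ([6.2, 2.9], 1),
    ([5.9, 3.0], 1),
]

# The cardinality of the dataset is the number of pairs it contains.
k = len(dataset)

# Unpack the first pair. Python indexes from 0, while the text counts from 1.
example_0, label_0 = dataset[0]
```

A plain list of tuples is enough to mirror the formal definition; in practice a library-specific container would usually replace it.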
A dataset has a finite number of elements, and our ...