In general, when working with a supervised scenario, we define a non-negative error measure em which takes two arguments (expected and predicted output ) and allows us to compute a total error value over the whole dataset (made up of N samples):
This value is also implicitly dependent on the specific hypothesis H through the parameter set, and therefore optimizing the error implies finding an optimal hypothesis ...