For the purposes of this section, we'll use a NN with a single regression output and mean square error (MSE) cost function, which is defined as follows:
Here, we have the following:
- fθ(x(i)) is the output of the NN.
- n is the total number of samples in the training set.
- x(i) are the vectors for the training samples, where the superscript i indicates the i-th sample of the dataset. We use superscript because x(i) is a vector and the subscript is reserved for each of the vector components. For example, is the j-th component of ...