The goal of the training algorithm is to find the weights and biases of the network that minimize a certain loss function, which depends on the prediction output and the true labels or values. To accomplish this, the gradients of the loss function, with respect to the weights and biases, are computed at the output, and the errors are propagated backward, up to the input layer. These propagated errors are, in turn, used to compute the gradients of all of the intermediate layers, up to the input layer. This technique of computing gradients is called **backpropagation**. During each iteration of the process, the current error in the output prediction is propagated backward through the network, to compute gradients with respect to ...