January 2018
Beginner to intermediate
284 pages
8h 35m
English
The loss function is the negative logarithm of the output of the softmax:
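The equation itself did not survive in this excerpt. In the notation commonly used for softmax classifiers, and assuming that f_j stands for the score of class j and y_i for the correct label of training example i (both symbols are assumptions, not taken from the text), the loss would read roughly as follows:

```latex
L_i = -\log\left( \frac{e^{f_{y_i}}}{\sum_j e^{f_j}} \right)
```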
Remember that the total loss for the dataset is the mean of L_i over all training examples, plus a regularization term, R(W).
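A minimal sketch of this computation in plain Python may make the pieces concrete. The function names are hypothetical, and R(W) is assumed here to be an L2 penalty with a tunable strength; the text does not say which regularizer is used.

```python
import math


def softmax(scores):
    """Turn a list of raw class scores into probabilities."""
    # Subtracting the maximum score leaves the result unchanged
    # but avoids overflow in exp() for large scores.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def example_loss(scores, label):
    """L_i: negative log of the softmax probability of the correct class."""
    return -math.log(softmax(scores)[label])


def l2_penalty(weights, strength):
    """Assumed form of R(W): strength times the sum of squared weights."""
    return strength * sum(w * w for row in weights for w in row)


def total_loss(all_scores, labels, weights, strength=0.0):
    """Mean of L_i over all training examples plus the regularization term."""
    per_example = [example_loss(s, y) for s, y in zip(all_scores, labels)]
    return sum(per_example) / len(per_example) + l2_penalty(weights, strength)
```

With two classes and equal scores, each example contributes log(2) to the mean, so any change in the total as the strength varies comes from the penalty term alone.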
From an information theory point of view, this is essentially a cross-entropy loss function.
We can understand it in the following way.
The cross-entropy is defined as:
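The definition is missing from this excerpt. The standard form, assuming p denotes the actual distribution and q the estimated one (here, the softmax output), with both symbols chosen for illustration, is:

```latex
H(p, q) = -\sum_x p(x) \log q(x)
```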
With the actual distribution as a delta function, its ...
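The link between the two views can be checked numerically. The sketch below assumes the delta function is represented as a one-hot vector over the classes and uses hypothetical names and made-up probabilities; it shows that the cross-entropy then reduces to the negative log of the probability assigned to the correct class.

```python
import math


def cross_entropy(p, q):
    """H(p, q) = -sum over x of p(x) * log q(x)."""
    # Terms with p(x) == 0 are skipped, following the convention 0 * log(0) = 0.
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)


# Illustrative softmax output over three classes (made-up values).
q = [0.7, 0.2, 0.1]

# Delta (one-hot) distribution: all probability mass on the correct class, index 1.
p = [0.0, 1.0, 0.0]

# Only the correct-class term survives the sum.
assert math.isclose(cross_entropy(p, q), -math.log(q[1]))
```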