Here is a table summarizing the different loss functions that we have described:
| Loss function | Use | Benefits | Disadvantages |
| --- | --- | --- | --- |
| L2 | Regression | More stable | Less robust |
| L1 | Regression | More robust | Less stable |
| Pseudo-Huber | Regression | More robust and stable | One more parameter |
| Hinge | Classification | Creates a max margin for use in SVM | Unbounded loss affected by outliers |
| Cross-entropy | Classification | More stable | Unbounded loss, less robust |
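The trade-offs among the regression losses in the table can be made concrete with a short sketch. The function names below are illustrative, not from any particular library; the pseudo-Huber form shown (with its extra `delta` parameter) is one common parameterization.

```python
import numpy as np

def l2_loss(y_true, y_pred):
    # Squared error: smooth everywhere, so gradients are stable,
    # but large residuals are amplified (less robust to outliers).
    return (y_true - y_pred) ** 2

def l1_loss(y_true, y_pred):
    # Absolute error: outliers contribute only linearly (more robust),
    # but the kink at zero makes optimization less stable.
    return np.abs(y_true - y_pred)

def pseudo_huber_loss(y_true, y_pred, delta=1.0):
    # Behaves like L2 near the target and like L1 for large residuals,
    # combining robustness and stability at the cost of tuning delta.
    r = y_true - y_pred
    return delta ** 2 * (np.sqrt(1.0 + (r / delta) ** 2) - 1.0)
```

For a small residual the pseudo-Huber loss is close to half the L2 loss, while for a large residual it grows roughly like `delta` times the L1 loss, which is exactly the "more robust and stable" behavior noted in the table.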
The remaining classification loss functions are all variants of the cross-entropy loss. The sigmoid cross-entropy loss function operates on unscaled logits and is preferred over computing the sigmoid and then the cross-entropy, because ...
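One commonly cited reason for preferring the fused form is numerical stability. A minimal NumPy sketch, assuming the standard stable identity `max(x, 0) - x*z + log(1 + exp(-|x|))` for logits `x` and labels `z` (the function names here are illustrative):

```python
import numpy as np

def stable_sigmoid_xent(logits, labels):
    # Stable fused form: never exponentiates a large positive number,
    # so it works even for extreme logits.
    x = np.asarray(logits, dtype=float)
    z = np.asarray(labels, dtype=float)
    return np.maximum(x, 0.0) - x * z + np.log1p(np.exp(-np.abs(x)))

def naive_sigmoid_xent(logits, labels):
    # Sigmoid first, then cross-entropy: for large-magnitude logits the
    # sigmoid saturates to exactly 0 or 1 and log() blows up to -inf/NaN.
    x = np.asarray(logits, dtype=float)
    z = np.asarray(labels, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
        p = 1.0 / (1.0 + np.exp(-x))
        return -(z * np.log(p) + (1.0 - z) * np.log(1.0 - p))
```

For moderate logits the two agree, but for an extreme logit such as `x = 1000` with label `1`, the naive version returns NaN while the stable version returns the correct loss of 0.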