There are several ways to control the training of a neural network to prevent overfitting, for example, L2/L1 regularization, max-norm constraints, and dropout:
- L2 regularization: This is probably the most common form of regularization. For each weight w, we add the term (1/2)λw² to the objective; under the gradient descent parameter update, this means that every weight decays linearly toward zero (w += −λw).
- L1 regularization: For each weight w, we add the term λ|w| to the objective. It is also possible to combine L1 and L2 regularization (λ₁|w| + λ₂w²), which is known as elastic net regularization.
- Max-norm constraints: Used to enforce an absolute upper bound on the magnitude of the weight vector for each hidden-layer neuron. Projected gradient descent can then be used to enforce the constraint: after each parameter update, any weight vector whose norm exceeds the bound c is scaled back down to it (see the code sketch after this list).
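To make the first three techniques concrete, here is a minimal sketch in PyTorch (the model architecture, layer sizes, and the l1_lambda and max_norm values are illustrative assumptions, not taken from the text): L2 regularization via the optimizer's weight_decay argument, an explicit L1 penalty added to the loss, and a max-norm projection applied after each update.

```python
import torch
import torch.nn as nn

# Illustrative model; the sizes are arbitrary assumptions.
model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 10))

# L2 regularization: weight_decay adds lambda * w to each gradient,
# so every weight is decayed linearly toward zero at each step.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

l1_lambda = 1e-5   # illustrative strength of the L1 penalty
max_norm = 3.0     # illustrative max-norm bound c

def train_step(inputs, targets):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(inputs), targets)
    # L1 regularization: add lambda * |w| to the objective for every weight.
    loss = loss + l1_lambda * sum(p.abs().sum() for p in model.parameters())
    loss.backward()
    optimizer.step()
    # Max-norm constraint via projected gradient descent: after the update,
    # rescale each neuron's incoming weight vector so that ||w||_2 <= c.
    # For nn.Linear, dim=0 of the weight matrix indexes output neurons.
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, nn.Linear):
                module.weight.data = module.weight.data.renorm(
                    p=2, dim=0, maxnorm=max_norm)
    return loss.item()

# Example usage with random data of matching shapes:
inputs = torch.randn(32, 100)
targets = torch.randint(0, 10, (32,))
print(train_step(inputs, targets))
```

Note that weight_decay applies the L2 gradient term λw directly rather than adding (1/2)λw² to the reported loss; for plain SGD the two are equivalent.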