ReLU has become increasingly popular in recent years; we can find it, or one of its variants, in almost any modern architecture. It has a simple mathematical formulation:
f(x)=max(0,x)
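This formula is a one-liner in code. The sketch below (a minimal illustration using NumPy; the sample values are arbitrary) applies ReLU element-wise to an array:

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives become 0, positives pass through
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # negative entries are squashed to zero
```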
In simple words, ReLU squashes any negative input to zero and leaves positive inputs unchanged. We can visualize the ReLU function as follows:
Some of the pros and cons of using ReLU are as follows:
- It helps the optimizer find the right set of weights sooner. More technically, it makes the convergence of ...