ReLU has become more popular in recent years; it, or one of its variants, appears in almost every modern architecture. It has a simple mathematical formulation:

    f(x) = max(0, x)

In simple terms, ReLU sets any negative input to zero and leaves positive inputs unchanged. We can visualize the ReLU function as follows:

[Figure: plot of the ReLU function]
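
The behavior described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a framework implementation; the names `relu` and `relu_all` are hypothetical helpers introduced here, and a real model would normally use the activation supplied by its deep learning library.

```python
def relu(x: float) -> float:
    """Return x for positive inputs and 0.0 otherwise."""
    return max(0.0, x)


def relu_all(values):
    """Apply relu element-wise to an iterable of numbers."""
    return [relu(v) for v in values]


# Negative inputs are clamped to zero; positive inputs pass through unchanged.
inputs = [-2.0, -0.5, 0.0, 1.5, 3.0]
outputs = relu_all(inputs)
print(outputs)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```

Note that the output is exactly zero for every negative input, however large its magnitude, while positive inputs keep their original scale.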

Some of the pros and cons of using ReLU are as follows:

  • It helps the optimizer find the right set of weights sooner. More technically, it makes the convergence of ...

