Activation functions are used to learn non-linear and complex functional mappings between the inputs and the response variable in an artificial neural network. One thing to keep in mind is that an activation function should be differentiable, so that backpropagation can compute gradients of the error (loss) with respect to the weights and then update those weights to reduce the error. Let's have a look at some of the popular activation functions and their properties:
- Sigmoid:
- The sigmoid function maps any real-valued input to the range (0, 1).
- It is usually used in the output layer of a binary classification problem.
- It is better than a linear activation because its output is bounded in the range (0, 1), so activations cannot grow without limit (see the sketch below).
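To make both properties concrete, here is a minimal NumPy sketch (not from any particular library; the names `sigmoid` and `sigmoid_derivative` are our own) of the sigmoid activation and its derivative. The derivative exists at every input, which is what allows backpropagation to flow gradients through this activation.

```python
import numpy as np

def sigmoid(x):
    """Squash real-valued inputs into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """sigma'(x) = sigma(x) * (1 - sigma(x)); defined everywhere,
    so gradients of the loss can be propagated through it."""
    s = sigmoid(x)
    return s * (1.0 - s)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))             # [0.119 0.5   0.881] -- all within (0, 1)
print(sigmoid_derivative(z))  # [0.105 0.25  0.105] -- largest at 0,
                              # small for large |z| (saturation)
```

Note that the derivative peaks at 0.25 and shrinks toward zero for large positive or negative inputs, which is why deep stacks of sigmoids can suffer from vanishing gradients.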