Choosing activation functions for multilayer networks
For simplicity, we have so far discussed only the sigmoid activation function in the context of multilayer feedforward neural networks; we used it in both the hidden layer and the output layer of the multilayer perceptron implementation in Chapter 12, Implementing a Multilayer Artificial Neural Network from Scratch.
Although we referred to this activation function as a sigmoid function (as it is commonly called in the literature), the more precise name is the logistic function; it is one member of the broader family of S-shaped (sigmoidal) functions. In the following subsections, you will learn more about alternative sigmoidal functions that are useful for implementing multilayer neural networks.
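As a reminder of what the logistic function computes, here is a minimal sketch in plain Python. The function name logistic and the sample inputs are chosen only for illustration and are not taken from the Chapter 12 implementation; the code evaluates 1 / (1 + exp(-z)) and branches on the sign of z so that the exponential never receives a large positive argument.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: 1 / (1 + exp(-z)).

    Illustrative helper, not the Chapter 12 code.
    """
    if z >= 0:
        # exp(-z) <= 1 here, so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form exp(z) / (1 + exp(z)).
    e = math.exp(z)
    return e / (1.0 + e)


# Evaluate at a few sample net inputs (arbitrary illustrative values).
for z in (-5.0, 0.0, 5.0):
    print(f"logistic({z:+.1f}) = {logistic(z):.4f}")
```

The output is squashed into the range from 0 to 1, with logistic(0) equal to 0.5 and the values at z and -z summing to 1, which gives the curve its characteristic S shape.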
Technically, we can use any function ...