7 Activation Function

In mathematical physics, quantum field theory and statistical mechanics are characterized by the probability distribution of exp(−βH(x)), where H(x) is a Hamiltonian function. It is well known that physical problems are determined by the algebraic structure of H(x). Statistical learning theory can be understood as mathematical physics where the Hamiltonian is a random process defined by the log likelihood ratio function.

Sumio Watanabe, Algebraic Geometry and Statistical Learning Theory

In Chapter 6 we established the importance of normalizing a neural network's input and output data, and the need for a nonlinear operation that makes a hidden layer sometimes correlated with an input and sometimes not. We also made it clear that the network's computations behave better when they process numbers between 0 and 1. Nevertheless, if we apply only the usual normalization and summation rules, the optimization is not well defined across all the mathematical links between the network's neurons.
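To make this concrete, here is a minimal NumPy sketch (the layer sizes, random weights, and the choice of the logistic sigmoid are illustrative assumptions, not taken from the book): two purely linear weighted sums stacked on top of each other collapse into a single linear map, while inserting a nonlinear activation keeps every hidden output between 0 and 1.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and random weights (hypothetical example, not from the text)
x = rng.normal(size=4)        # one normalized input vector
W1 = rng.normal(size=(3, 4))  # weights of the first hidden layer
W2 = rng.normal(size=(2, 3))  # weights of the second layer

def sigmoid(z):
    # Logistic function: maps any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Two linear layers in a row are equivalent to one linear layer:
two_linear = W2 @ (W1 @ x)
one_linear = (W2 @ W1) @ x
print(np.allclose(two_linear, one_linear))  # True: no gain in expressive power

# With a sigmoid between the layers, hidden outputs stay in (0, 1):
hidden = sigmoid(W1 @ x)
output = sigmoid(W2 @ hidden)
print(hidden.min() > 0.0, hidden.max() < 1.0)  # True True

The comparison shows why the nonlinearity matters: without it, the whole network degenerates into a single weighted sum, no matter how many layers it has.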

To ensure that the neural network represents a valid mathematical model, we need to introduce specific functions: activation functions. We will take Particle Physics as an analogy. In Fermi statistics, the estimation problem arose from the need to approximate physical quantities that are impossible to calculate exactly because of their enormous order of magnitude, especially when singularity ...
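As a hedged sketch of how the Fermi-statistics analogy can be made concrete (the parameter names mu and kT and the comparison itself are assumptions here, not the authors' notation): the logistic sigmoid used as an activation function has the same form as the Fermi-Dirac occupation function, 1/(exp((E − mu)/kT) + 1), once the neuron's weighted sum is read as a negated, rescaled energy.

import numpy as np

def fermi_dirac(E, mu=0.0, kT=1.0):
    # Mean occupation of a single-particle state of energy E under Fermi statistics
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

def sigmoid(z):
    # Logistic activation function
    return 1.0 / (1.0 + np.exp(-z))

# The sigmoid is the Fermi-Dirac function evaluated at E = -z (with mu = 0, kT = 1)
z = np.linspace(-5.0, 5.0, 11)
print(np.allclose(sigmoid(z), fermi_dirac(-z)))  # True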
