The universal approximation theorem was first proved in 1989 for an NN with sigmoid activation functions, and then in 1991 for NNs with arbitrary non-linear activation functions. It states that any continuous function on compact subsets of ℝⁿ can be approximated to an arbitrary degree of accuracy by a feedforward NN with at least one hidden layer, a finite number of units, and a non-linear activation. Although an NN with a single hidden layer won't perform well in many tasks, the theorem still tells us that NNs face no theoretically insurmountable limitations in what they can represent. The formal proof of the theorem ...
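As a rough numerical illustration of the statement above (not the construction used in the proof), the sketch below fits a network with a single sigmoid hidden layer to sin(x) on the compact interval [-π, π]. Everything specific here is an assumption made for the example: the helper name `fit_single_hidden_layer`, the target function, the 50 hidden units, and the weight ranges. As a shortcut, the hidden weights are drawn at random and kept fixed, and only the output weights are solved by least squares instead of training the whole network with gradient descent.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_single_hidden_layer(x, y, n_hidden, rng):
    """Fit y ~ f(x) with one sigmoid hidden layer (hypothetical helper).

    Assumption: hidden weights and biases are random and fixed; only the
    linear output layer is fitted, using ordinary least squares.
    """
    w = rng.normal(0.0, 2.0, size=n_hidden)    # hidden weights (fixed)
    b = rng.uniform(-6.0, 6.0, size=n_hidden)  # hidden biases (fixed)

    def hidden(x_in):
        # Hidden activations plus a constant column for the output bias
        h = sigmoid(np.outer(x_in, w) + b)
        return np.column_stack([h, np.ones(len(x_in))])

    # Solve for the output weights that minimise the squared error
    v, *_ = np.linalg.lstsq(hidden(x), y, rcond=None)

    def predict(x_new):
        return hidden(x_new) @ v

    return predict


rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 400)  # samples from a compact interval
y = np.sin(x)                        # continuous target function

predict = fit_single_hidden_layer(x, y, n_hidden=50, rng=rng)
max_error = np.max(np.abs(predict(x) - y))
print(f"max |error| with 50 hidden units: {max_error:.2e}")
```

The theorem only guarantees that suitable weights exist for any desired accuracy. It does not say how many units are needed or how to find the weights, which is why the choices in this sketch are illustrative only.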
The universal approximation theorem