Deep architectures
MLPs are powerful, but their expressiveness is limited by the number and nature of their layers. Deep learning architectures, on the other hand, are built as a sequence of heterogeneous layers that perform different operations, organized in a computational graph. The output of each layer, correctly reshaped, is fed into the following one, until the final output, which is normally associated with a loss function to optimize. The most interesting applications have been made possible by this stacking strategy, where the number of trainable parameters (weights and biases) can easily exceed 10 million; hence, the ability to capture small details and generalize from them is remarkable. In the following section, I'm going ...
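The stacking idea described above can be illustrated with a minimal sketch in NumPy: a small batch flows through a sequence of heterogeneous layers (a reshape, a dense layer with an activation, a linear projection), and the final output is passed to a loss function. All layer names, shapes, and weight values here are hypothetical, chosen only to make the forward pass concrete; a real deep architecture would use a framework with automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy layers; each consumes the previous layer's output,
# reshaping where necessary, and the last one feeds a loss function.
def flatten(x):
    # Reshape the output so it fits the next (dense) layer
    return x.reshape(x.shape[0], -1)

def dense_relu(x, w, b):
    # Affine transform followed by a ReLU non-linearity
    return np.maximum(0.0, x @ w + b)

def softmax(z):
    # Numerically stable softmax over the class dimension
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, y):
    # Mean negative log-likelihood of the true classes
    return -np.log(p[np.arange(len(y)), y]).mean()

# A batch of 4 toy 8x8 "images" stacked through the layers above
x = rng.normal(size=(4, 8, 8))
w1, b1 = 0.1 * rng.normal(size=(64, 32)), np.zeros(32)
w2, b2 = 0.1 * rng.normal(size=(32, 10)), np.zeros(10)
y = np.array([0, 1, 2, 3])  # hypothetical target labels

h = flatten(x)            # (4, 64)
h = dense_relu(h, w1, b1) # (4, 32)
logits = h @ w2 + b2      # (4, 10)
loss = cross_entropy(softmax(logits), y)
print(f"loss = {loss:.4f}")
```

Each intermediate array is just the previous layer's output reshaped or transformed, which is exactly the computational-graph view: optimizing the final loss drives the adjustment of every weight and bias along the chain.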