The unrolled computational graph in the previous diagram shows that each transformation, which consists of a linear matrix operation followed by an element-wise non-linearity, is shallow in the sense that it could be represented by a single network layer.
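To make this concrete, a vanilla RNN cell performs exactly one such linear-plus-nonlinear step per time step. In illustrative notation (the weight matrices and bias names below are assumptions, not taken from the text):

\[
h_t = \tanh\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right), \qquad \hat{y}_t = W_{hy}\, h_t + b_y
\]

A single affine map followed by one element-wise tanh is all the processing that separates $x_t$ and $h_{t-1}$ from $h_t$, which is what makes each step shallow.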
Just as added depth enabled FFNNs, and in particular CNNs, to learn more useful hierarchical representations, RNNs can also benefit from decomposing the input-output mapping into multiple layers. In the RNN context, this mapping consists of the transformations from the input to the hidden state, between successive hidden states, and from the hidden state to the output.
A common approach that we will employ next is to stack recurrent layers on top of each other. As a result, these ...
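As a minimal sketch of this stacking idea, the Keras snippet below chains two LSTM layers by having the first one emit its full hidden-state sequence; the layer sizes, sequence length, and feature count are illustrative assumptions, not the book's model:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Illustrative sketch of stacked recurrent layers; all sizes below
# (sequence length 100, 20 features, 64/32 units) are assumptions.
model = Sequential([
    # return_sequences=True makes this layer output one hidden-state
    # vector per time step, so the next recurrent layer receives a
    # full sequence rather than only the final state.
    LSTM(64, return_sequences=True, input_shape=(100, 20)),
    # The second layer returns only its final hidden state, which
    # summarizes the sequence for the dense output layer.
    LSTM(32),
    Dense(1),
])
model.compile(optimizer='adam', loss='mse')
```

The key detail is return_sequences=True: without it, the first layer would emit a single vector and the second recurrent layer would have no time dimension left to process.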