"Building a deep RNN by stacking multiple recurrent hidden states on top of each other. This approach potentially allows the hidden state at each level to operate at different timescales." How to Construct Deep Recurrent Neural Networks (link: https://arxiv.org/abs/1312.6026), 2013
"RNNs are inherently deep in time, since their hidden state is a function of all previous hidden states. The question that inspired this paper was whether RNNs could also benefit from depth in space; that is from stacking multiple recurrent hidden layers on top of each other, just as feedforward layers are stacked in conventional deep networks." Speech Recognition with Deep Recurrent Neural Networks (link: https://arxiv.org/abs/1303.5778), 2013
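The "depth in space" both papers describe can be sketched directly: each recurrent layer produces a full sequence of hidden states, and that sequence becomes the input to the layer above it. Below is a minimal NumPy sketch of this stacking, assuming a plain tanh RNN cell; the function and variable names are illustrative, not from either paper.

```python
import numpy as np

def stacked_rnn_forward(x, weights):
    """Forward pass through stacked (deep-in-space) vanilla RNN layers.

    x: array of shape (timesteps, input_dim)
    weights: list of (W_in, W_rec, b) tuples, one per layer.
    Each layer emits its hidden state at every timestep, and that
    sequence of states is fed as input to the layer above it.
    """
    seq = x
    for W_in, W_rec, b in weights:
        hidden = np.zeros(W_rec.shape[0])
        outputs = []
        for t in range(seq.shape[0]):
            # Standard tanh RNN cell: mix current input with previous state.
            hidden = np.tanh(seq[t] @ W_in + hidden @ W_rec + b)
            outputs.append(hidden)
        seq = np.stack(outputs)  # this layer's states are the next layer's input
    return seq

rng = np.random.default_rng(0)

def make_layer(in_dim, hid_dim):
    # Small random weights for illustration only.
    return (rng.normal(scale=0.1, size=(in_dim, hid_dim)),
            rng.normal(scale=0.1, size=(hid_dim, hid_dim)),
            np.zeros(hid_dim))

weights = [make_layer(8, 16), make_layer(16, 16)]  # two stacked recurrent layers
x = rng.normal(size=(5, 8))                        # 5 timesteps, 8 features
h = stacked_rnn_forward(x, weights)
print(h.shape)  # (5, 16): one 16-dim hidden state per timestep from the top layer
```

Because the upper layer only sees the lower layer's already-summarized states, it can, as the first quote suggests, integrate information over a different effective timescale than the layer below it.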
Most researchers are using ...