There's more...
A lot of popular models that have been proposed for neural machine translation across various domains belong to the encoder-decoder architecture family. However, this architecture restricts the encoder to encoding the input sequence to a fixed-length representation, which results in deteriorated performance for lengthy input sequences. One of the ways to overcome this bottleneck in performance is to use the attention mechanism within the sequences, which makes the network learn to pay selective attention to the inputs that are relevant for predicting a target output. Most importantly, attention allows the network to encode the input sequence into a sequence of vectors and choose among these vectors while decoding, thus freeing ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access