Luong attention

Luong attention (see Effective Approaches to Attention-based Neural Machine Translation at https://arxiv.org/abs/1508.04025) introduces several improvements over Bahdanau attention. Most notably, the alignment scores e_t depend on the current decoder hidden state s_t, as opposed to s_{t-1} in Bahdanau attention. To better understand this, let's compare the two algorithms:

Figure: Bahdanau attention (left) versus Luong attention (right)
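
The paper also evaluates several alternatives for the score function that produces the alignment scores. In the notation of this section, with W_a and v_a as learned parameters, they can be summarized as follows:

```latex
e_{t,i} = \mathrm{score}(s_t, h_i) =
\begin{cases}
  s_t^\top h_i                                 & \text{dot} \\
  s_t^\top W_a h_i                             & \text{general} \\
  v_a^\top \tanh\!\left(W_a [s_t; h_i]\right)  & \text{concat}
\end{cases}
```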

Let's go through a step-by-step execution of Luong attention:

  1. Feed the encoder with the input sequence and compute the set of encoder hidden states h = {h_1, h_2, ..., h_T}.
  2. Compute the decoder hidden state s_t based on the previous decoder hidden state s_{t-1} and the previous output word y_{t-1} (but, unlike Bahdanau attention, not on a context vector).
  3. Compute the alignment scores e_{t,i} = score(s_t, h_i), which depend on the current decoder hidden state s_t.
  4. Normalize the scores with softmax into attention weights α_{t,i} and compute the context vector as the weighted sum of the encoder hidden states: c_t = Σ_i α_{t,i} h_i.
  5. Concatenate c_t and s_t and pass the result through a fully connected layer with tanh activation to produce the attentional hidden state s̃_t = tanh(W_c[c_t; s_t]).
  6. Predict the next output word with a softmax over s̃_t: y_t = softmax(W_y s̃_t). A code sketch of these steps follows this list.
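
To make the procedure concrete, here is a minimal PyTorch sketch of steps 3 to 5, using the "general" score function. The class and parameter names (LuongAttention, W_a, W_c) are illustrative choices, not code from the paper or from this book:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LuongAttention(nn.Module):
    """A sketch of Luong attention with the "general" score function."""

    def __init__(self, hidden_size):
        super().__init__()
        # W_a implements the "general" score: score(s_t, h_i) = s_t^T W_a h_i
        self.W_a = nn.Linear(hidden_size, hidden_size, bias=False)
        # W_c maps the concatenated [c_t; s_t] to the attentional state
        self.W_c = nn.Linear(2 * hidden_size, hidden_size, bias=False)

    def forward(self, s_t, h):
        # s_t: current decoder hidden state, shape (batch, hidden)
        # h: encoder hidden states, shape (batch, src_len, hidden)
        # Step 3: alignment scores based on the current decoder state s_t
        scores = torch.bmm(h, self.W_a(s_t).unsqueeze(2)).squeeze(2)  # (batch, src_len)
        # Step 4: softmax over the source positions, then the context vector
        alpha = F.softmax(scores, dim=1)
        c_t = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)  # (batch, hidden)
        # Step 5: concatenate and project to the attentional hidden state
        s_tilde = torch.tanh(self.W_c(torch.cat([c_t, s_t], dim=1)))
        return s_tilde, alpha


# Example usage with random tensors
attention = LuongAttention(hidden_size=128)
s_t = torch.randn(4, 128)     # a batch of 4 decoder states
h = torch.randn(4, 10, 128)   # 10 encoder states per sequence
s_tilde, alpha = attention(s_t, h)
print(s_tilde.shape, alpha.shape)  # torch.Size([4, 128]) torch.Size([4, 10])
```

Step 6, the softmax over the target vocabulary, belongs to the surrounding decoder and is omitted here; in a full model, s_tilde would be fed to an output projection layer.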
