The key to the read and write operations is the weight vector, which specifies which rows of memory to read from or write to. The controller produces this weight vector in four stages, each of which produces an intermediate vector that is passed to the next stage:

  • The first stage is content-based addressing, which generates a weight vector based on how similar each row of memory is to a given key vector, k_t, of length C. More precisely, the controller emits a vector k_t that is compared to each row of M_t using the cosine similarity measure, defined as follows:

    $$K(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$$
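As a rough illustration of this comparison step, the sketch below computes the cosine similarity (the dot product divided by the product of the two vector norms) between a key and every row of a small memory matrix. The names `cosine_similarity`, `content_similarities`, `memory`, and `key` are hypothetical, the values are made up for the example, and the small epsilon in the denominator is an assumption added to avoid division by zero. The scores returned here are the raw, unnormalized similarities only.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Epsilon is an assumption of this sketch: it guards against all-zero vectors.
    return dot / (norm_u * norm_v + 1e-8)


def content_similarities(memory, key):
    """Compare the key against each memory row; returns one score per row."""
    return [cosine_similarity(key, row) for row in memory]


# Toy memory with 3 rows, each of length C = 3 (illustrative values only).
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 0.0],
]
key = [1.0, 0.0, 0.0]

scores = content_similarities(memory, key)
print(scores)
```

In this toy example the first row points in the same direction as the key and so receives the highest score, the second row is orthogonal to it, and the third falls in between. Note that the length of a row does not matter, only its direction.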

The content weight vector is not normalized yet, so it is normalized with the ...
