Chapter 8. Memory Augmented Neural Networks

So far we’ve seen how effective an RNN can be at solving a complex problem like machine translation. However, we’re still far from reaching its full potential! In Chapter 7 we mentioned that the RNN architecture has been theoretically proven to be a universal functional representer; a more precise statement of the same result is that RNNs are Turing complete. This means that, given proper wiring and adequate parameters, an RNN can learn to solve any computable problem, that is, any problem that can be solved by a computer algorithm or, equivalently, by a Turing machine.

Neural Turing Machines

Though theoretically possible, that kind of universality is extremely difficult to achieve in practice! The difficulty stems from the immense search space of possible RNN wirings and parameter values, a space far too large for gradient descent to find an appropriate solution for an arbitrary problem. However, in the remaining sections of this chapter we’ll explore some approaches at the edge of research that let us start tapping into that potential!

Let’s think for a while about a very simple reading comprehension question like the following:

Mary travelled to the hallway. She grabbed the milk glass there.
Then she travelled to the office, where she found an apple
and grabbed it.

How many objects is Mary carrying?

The answer is trivial: it’s two! But ...
