In this chapter we’ll look at using sequence-to-sequence networks to learn transformations between pieces of text. This is a relatively new technique with tantalizing possibilities. Google claims to have made huge improvements to its Google Translate product using it; moreover, Google has open-sourced a version that can learn translations between languages purely from parallel texts.
We won’t go that far straight away. Instead, we’ll start with a simple model that learns the rules for pluralization in English. After that we’ll extract dialogue from 19th-century novels from Project Gutenberg and train a chatbot on it. For this last project we’ll have to abandon the safety of Keras running in a notebook and use Google’s open-source seq2seq toolkit.
The following notebooks contain the code relevant for this chapter:
08.1 Sequence to sequence mapping
08.2 Import Gutenberg
08.3 Subword tokenizing
How do you train a model to reverse engineer a transformation?
Use a sequence-to-sequence mapper.
In Chapter 5 we saw how we can use recurrent networks to “learn” the rules of a sequence. The model learns how to best represent a sequence such that it can predict what the next element will be. Sequence-to-sequence mapping builds on this, but now the model learns to predict a different sequence based on the first one.
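To make this concrete, here is a minimal sketch of such an encoder/decoder pair in Keras: one LSTM reads the input sequence and its final state is used to condition a second LSTM that predicts the output sequence. This assumes a one-hot, character-level encoding, and `NUM_TOKENS` and `LATENT_DIM` are illustrative placeholders, not the values used in the chapter’s notebook.

```python
from keras.layers import Dense, Input, LSTM
from keras.models import Model

NUM_TOKENS = 64    # illustrative character-vocabulary size (assumed)
LATENT_DIM = 128   # illustrative size of the LSTM state (assumed)

# Encoder: read the input sequence and keep only its final state,
# which serves as a summary of everything the encoder has seen.
encoder_inputs = Input(shape=(None, NUM_TOKENS))
_, state_h, state_c = LSTM(LATENT_DIM, return_state=True)(encoder_inputs)

# Decoder: start from the encoder's state and predict the output
# sequence one token at a time.
decoder_inputs = Input(shape=(None, NUM_TOKENS))
decoder_outputs = LSTM(LATENT_DIM, return_sequences=True)(
    decoder_inputs, initial_state=[state_h, state_c])
decoder_outputs = Dense(NUM_TOKENS, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy')
```

During training the decoder is fed the target sequence shifted by one position (teacher forcing); at inference time it runs one step at a time, feeding each predicted token back in as the next input.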
We can use this to learn all kinds of transformations. ...