Preparing data for the NMT system

In this section, we describe how to prepare data for training the NMT system and for predicting with it. First, we discuss how to prepare the training data (that is, the source sentence and target sentence pairs) used to train the NMT system. We then discuss how a given source sentence is fed to the system to produce its translation.

At training time

The training data consists of pairs of source sentences and their corresponding translations in the target language. Two example pairs might look like this:

  • (Ich ging nach Hause, I went home)
  • (Sie hat in der Schule gewartet, She was waiting at school)
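As a minimal sketch, the pairs above could be held in memory as a list of (source, target) tuples and split into parallel source and target lists. The names `sentence_pairs` and `tokenize` are hypothetical, and whitespace splitting is only an assumed stand-in for whatever tokenizer the actual system uses.

```python
# Hypothetical in-memory representation of the two example pairs above.
# Each tuple is (source sentence in German, target sentence in English).
sentence_pairs = [
    ("Ich ging nach Hause", "I went home"),
    ("Sie hat in der Schule gewartet", "She was waiting at school"),
]


def tokenize(sentence):
    """Split a sentence on whitespace (an assumed placeholder tokenizer)."""
    return sentence.split()


# Parallel lists: the i-th source sentence corresponds to the i-th target.
source_sentences = [tokenize(src) for src, _ in sentence_pairs]
target_sentences = [tokenize(tgt) for _, tgt in sentence_pairs]

# Number of sentence pairs in this toy dataset.
num_pairs = len(sentence_pairs)
```

Keeping the source and target sentences in two index-aligned lists makes it straightforward to feed the i-th source sentence to the model together with its i-th target during training.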

We have N such pairs in our dataset. If we are to implement a fairly good translator, N needs to be in the scale ...

