Inference
In this section, we consider both of our experimental setups: one where the generator and discriminator predict and discriminate sequences of words, and another where the models predict and discriminate sequences of characters. Note that in both cases there is no difference between the representation of a word and that of a character; each is just a vector in a multidimensional space.
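To make the point about representation concrete, here is a minimal sketch in plain Python. It is not the book's implementation: the sample sentence, the `EMBED_DIM` value, the randomly initialised embedding tables, and the helper names `build_vocab`, `make_embeddings`, and `embed` are all assumptions chosen for illustration. It tokenizes the same text at the word level and at the character level and maps each token to a vector of the same size.

```python
import random


def build_vocab(tokens):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def make_embeddings(vocab_size, dim, seed=0):
    """Build a randomly initialised table with one dim-sized vector per token id."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(tokens, vocab, table):
    """Look up the vector for each token in the sequence."""
    return [table[vocab[tok]] for tok in tokens]


text = "the cat sat on the mat"  # hypothetical sample sentence
EMBED_DIM = 8                    # hypothetical embedding size

word_tokens = text.split()  # word-level tokens
char_tokens = list(text)    # character-level tokens (spaces included)

word_vocab = build_vocab(word_tokens)
char_vocab = build_vocab(char_tokens)

word_table = make_embeddings(len(word_vocab), EMBED_DIM, seed=0)
char_table = make_embeddings(len(char_vocab), EMBED_DIM, seed=1)

word_vectors = embed(word_tokens, word_vocab, word_table)
char_vectors = embed(char_tokens, char_vocab, char_table)

# Both sequences are lists of EMBED_DIM-sized vectors; only their
# length and vocabulary differ.
print(len(word_vectors), len(word_vectors[0]))  # 6 8
print(len(char_vectors), len(char_vectors[0]))  # 22 8
```

In this sketch, a model that consumes `word_vectors` or `char_vectors` sees the same kind of input in both setups: a sequence of fixed-size vectors. The same text yields 6 word vectors but 22 character vectors.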
Assuming the same sequence length, the task of predicting a sequence of characters is harder than the task of predicting a sequence of words, for two reasons. First, in the character case the model has to perform more predictions. Second, the overall entropy, or uncertainty, is higher when predicting characters than when predicting words, as it implies predicting ...