Improving LSTMs – beam search

As we saw earlier, the generated text can be improved. Now let's see whether beam search, which we discussed in Chapter 7, Long Short-Term Memory Networks, helps to improve the performance. In beam search, instead of predicting one bigram at a time, we look ahead a number of steps and pick the candidate sequence of bigrams (called a beam) with the highest joint probability. The joint probability of a beam is calculated by multiplying together the prediction probabilities of each bigram in that beam. Note that this is still a greedy search: as the search tree grows, we keep only the best candidates at each depth and discard the rest. Consequently, this search is not guaranteed to find the globally best beam. ...
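As a minimal sketch of the idea (not the book's implementation), the procedure can be written framework-agnostically. Here `predict_fn` is a hypothetical stand-in for the LSTM's softmax step: given a sequence so far, it returns candidate next tokens with their probabilities. Log probabilities are summed rather than raw probabilities multiplied, which is numerically safer but mathematically equivalent:

```python
import math

def beam_search(predict_fn, seed, beam_width=3, depth=5):
    """Greedy beam search: at each depth, keep only the beam_width best
    candidate sequences, scored by their joint (log) probability.

    predict_fn(sequence) -> list of (token, probability) pairs
    (a hypothetical stand-in for the model's softmax output).
    """
    # Each beam is (sequence, cumulative log probability).
    beams = [(list(seed), 0.0)]
    for _ in range(depth):
        candidates = []
        for seq, score in beams:
            for token, prob in predict_fn(seq):
                # Summing logs equals multiplying the probabilities.
                candidates.append((seq + [token], score + math.log(prob)))
        # Prune: keep only the beam_width highest-scoring candidates.
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    # Best beam found; because of pruning, not guaranteed globally optimal.
    return beams[0][0]
```

A toy transition table shows why this can beat one-step greedy decoding: the locally best first token ("b", 0.6) leads to weak continuations, while the beam recovers the better overall sequence through "c".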
