Defining the LSTM
Now that we have defined the data generator, which outputs a batch of image feature vectors followed by the captions of the respective images word by word, we will define the LSTM cell. The definition of the LSTM and the training procedure are similar to what we observed in the previous chapter.
We will first define the parameters of the LSTM cell. The input gate, the forget gate, the output gate, and the candidate value calculation each get two sets of weights and a bias:
# Input gate (i_t) - How much memory to write to cell state
# Connects the current input to the input gate
ix = tf.Variable(tf.truncated_normal([embedding_size, num_nodes], stddev=0.01))
# Connects the previous hidden state to the input gate
im = ...
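The parameter layout described above (two weight matrices and one bias per gate) can be illustrated with a minimal NumPy sketch. This is not the book's TensorFlow code: the names `init_lstm_params`, `lstm_step`, and the dictionary keys are hypothetical, the weights are drawn from a plain normal distribution rather than a truncated one, and the update rule assumes the standard LSTM equations.

```python
import numpy as np


def init_lstm_params(embedding_size, num_nodes, stddev=0.01, seed=0):
    """Create two weight matrices and one bias per gate (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("input", "forget", "output", "candidate"):
        params[gate] = {
            # Connects the current input to this gate
            # (plain normal here, not truncated normal as in the excerpt)
            "x": rng.normal(0.0, stddev, (embedding_size, num_nodes)),
            # Connects the previous hidden state to this gate
            "h": rng.normal(0.0, stddev, (num_nodes, num_nodes)),
            # One bias value per hidden unit
            "b": np.zeros((1, num_nodes)),
        }
    return params


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell (assumed formulation)."""

    def pre_activation(gate):
        p = params[gate]
        return x @ p["x"] + h_prev @ p["h"] + p["b"]

    i = sigmoid(pre_activation("input"))          # how much to write
    f = sigmoid(pre_activation("forget"))         # how much to keep
    o = sigmoid(pre_activation("output"))         # how much to expose
    c_tilde = np.tanh(pre_activation("candidate"))  # candidate value
    c = f * c_prev + i * c_tilde                  # new cell state
    h = o * np.tanh(c)                            # new hidden state
    return h, c


# Toy sizes chosen only for illustration
batch_size, embedding_size, num_nodes = 4, 8, 16
params = init_lstm_params(embedding_size, num_nodes)
x = np.ones((batch_size, embedding_size))
h = np.zeros((batch_size, num_nodes))
c = np.zeros((batch_size, num_nodes))
h, c = lstm_step(x, h, c, params)
```

Keeping the parameters in a dictionary keyed by gate makes the "two weights and a bias per gate" structure explicit; a TensorFlow version would typically hold each entry in its own `tf.Variable` instead.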