In previous sections, we discussed the input, the network architecture, and how we plan to decode the hidden states to predict the output vectors. Now, we will define the objective function that we will use to train our model.
Cross-entropy loss is a good fit for our task. Because we decode the output states with the second method, a CRF, the loss is based on the log-likelihood of the correct tag sequence. A CRF is complex to implement by hand, so we will rely on the easy-to-use function that TensorFlow provides, as follows:
# Define the labels as a placeholder with shape = (batch size, sentence length)
labels = tf.placeholder(tf.int32, shape=[None, None], name="labels")
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood( ...
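To build some intuition for the quantity that a CRF log-likelihood represents, the following is a minimal pure-Python sketch for a single sentence. It is not TensorFlow's implementation. It assumes a linear-chain CRF with no special start or end transitions, and the names `scores`, `transitions`, and `gold_tags`, along with the toy values, are hypothetical. The log-likelihood is the score of the correct tag sequence minus the log of the summed exponentiated scores of all possible tag sequences, which the forward algorithm computes without enumerating them.

```python
import math
from itertools import product


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(scores, transitions, tags):
    """Unnormalized score of one tag sequence: per-word scores plus transition scores."""
    total = scores[0][tags[0]]
    for t in range(1, len(tags)):
        total += transitions[tags[t - 1]][tags[t]] + scores[t][tags[t]]
    return total


def log_partition(scores, transitions):
    """Log of the summed exp-scores over all tag sequences (forward algorithm)."""
    num_tags = len(scores[0])
    alpha = list(scores[0])
    for t in range(1, len(scores)):
        alpha = [
            scores[t][j]
            + logsumexp([alpha[i] + transitions[i][j] for i in range(num_tags)])
            for j in range(num_tags)
        ]
    return logsumexp(alpha)


def crf_log_likelihood(scores, transitions, tags):
    """Log-probability of the given tag sequence under the CRF."""
    return path_score(scores, transitions, tags) - log_partition(scores, transitions)


# Hypothetical toy sentence: 3 words, 2 possible tags.
scores = [[1.0, 0.5], [0.2, 1.5], [0.3, 0.1]]   # scores[t][j]: score of tag j for word t
transitions = [[0.5, -0.2], [-0.1, 0.8]]        # transitions[i][j]: score of tag i -> tag j
gold_tags = [0, 1, 1]

# The training loss is the negative log-likelihood of the correct sequence.
loss = -crf_log_likelihood(scores, transitions, gold_tags)
print(loss)
```

Because the log-likelihood is a log-probability over whole tag sequences, negating it gives a cross-entropy-style loss, and for a batch one would typically average it over the sentences.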