The training step

In previous sections, we discussed the input, the network architecture, and how we plan to decode the hidden states to predict the output vectors. Now, we will define the objective function that we will use to train the model.

Cross-entropy loss works well for the task we are dealing with. We will use the second method to decode the output states, which is a CRF. Rather than implementing such a complex concept ourselves, we will rely on an easy-to-use function that TensorFlow provides, as follows:

# Define the labels as a placeholder with shape = (batch size, sentences)
labels = tf.placeholder(tf.int32, shape=[None, None], name="labels")
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood( ...
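To make the CRF objective less of a black box, the following is a minimal pure-Python sketch of the log-likelihood of one tag sequence under a linear-chain CRF. It is an illustration only, not TensorFlow's implementation. The names `unary`, `transitions`, `crf_log_likelihood`, and `crf_loss` are assumptions chosen for this example, and the sketch assumes no special start or end transitions and a single unpadded sentence.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def crf_log_likelihood(unary, transitions, tags):
    """Log-likelihood of `tags` for one sentence (illustrative sketch).

    unary[t][k]        score of tag k at position t (e.g. network output)
    transitions[i][j]  score of moving from tag i to tag j
    tags[t]            gold tag index at position t
    """
    num_tags = len(transitions)

    # Score of the gold path: unary scores plus transition scores.
    score = unary[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + unary[t][tags[t]]

    # Forward algorithm: log of the summed exponentiated scores of all paths.
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][j] for i in range(num_tags)])
            + unary[t][j]
            for j in range(num_tags)
        ]
    log_partition = logsumexp(alpha)

    return score - log_partition


def crf_loss(unary, transitions, tags):
    """Training loss: the negative log-likelihood of the gold tags."""
    return -crf_log_likelihood(unary, transitions, tags)
```

Under these assumptions, training would minimize `crf_loss` averaged over the sentences in a batch, so that the gold tag sequence receives a larger share of the total probability mass.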
