Suppose that we would like to classify each word into one of five classes; we would compute a five-dimensional vector, *s*. Each component of this vector can be interpreted as a score for one class: the *i*-th component of *s* gives the probability, or the score, of the *i*-th class, given the word *w*.

This dense vector can be computed in TensorFlow as follows:

    # Weights matrix initialised with the default initialiser
    W = tf.get_variable("W", shape=[2 * hidden_state_size, ntags], dtype=tf.float32)
    # Bias vector initialised using a zero initialiser
    b = tf.get_variable("b", shape=[ntags], dtype=tf.float32, ...
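To make the shapes concrete, here is a minimal NumPy sketch of the same kind of projection. It is not the TensorFlow code above. The sizes (`hidden_state_size = 3`, `ntags = 5`), the random stand-in for the word representation `h`, and the softmax step used to turn scores into probabilities are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(scores):
    """Turn a vector of raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Illustrative sizes (assumptions, not values from the text).
hidden_state_size = 3
ntags = 5  # five classes, as in the example above

rng = np.random.default_rng(0)

# Hypothetical representation of a single word w, of size 2 * hidden_state_size
# so that it matches the first dimension of W in the TensorFlow snippet.
h = rng.standard_normal(2 * hidden_state_size)

# Same shapes as W and b above; b starts at zero.
W = rng.standard_normal((2 * hidden_state_size, ntags))
b = np.zeros(ntags)

# s has one score per class.
s = h @ W + b

# Assumed normalisation step: softmax gives one probability per class.
probs = softmax(s)

print("scores:", s)
print("probabilities:", probs)
print("predicted class index:", int(np.argmax(probs)))
```

Because softmax preserves the ordering of its inputs, the class with the highest score in `s` is also the class with the highest probability.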