In this section, we will explain step-by-step how to build and train a Skip-Gram model using TensorFlow. For a detailed tutorial and source code, please refer to https://www.tensorflow.org/tutorials/word2vec:
- We can download the dataset from http://mattmahoney.net/dc/text8.zip.
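A minimal sketch of the download step, assuming we fetch the archive with Python's standard library (the URL is from the text; the local filename and helper name are our own choices):

```python
import os
import urllib.request

DATA_URL = "http://mattmahoney.net/dc/text8.zip"
FILENAME = "text8.zip"

def maybe_download(filename=FILENAME, url=DATA_URL):
    """Download the corpus archive only if it is not already present locally."""
    if not os.path.exists(filename):
        filename, _ = urllib.request.urlretrieve(url, filename)
    return filename
```

Guarding on `os.path.exists` avoids re-downloading the ~31 MB archive on repeated runs.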
- We read in the content of the file as a list of words.
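This step can be sketched as follows, assuming the downloaded archive contains a single plain-text file (as text8.zip does) and splitting on whitespace to obtain the word list (the helper name is our own):

```python
import zipfile

def read_words(zip_path):
    """Read the first file in the archive and split its content into words."""
    with zipfile.ZipFile(zip_path) as f:
        data = f.read(f.namelist()[0]).decode("utf-8").split()
    return data
```

The result is a flat list of word tokens, from which the vocabulary and integer indices are built in the next step.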
- We set up the TensorFlow graph. We create placeholders for the input words and the context words, each represented as an integer index into the vocabulary:
train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
Note that we train in batches, so batch_size is the number of examples processed per batch. We also create a ...