Because Word2vec models are themselves neural networks, we train them just as we would a standard feedforward network: with a loss function and stochastic gradient descent. During training, the algorithm scans over the input corpus and consumes it in batches. After each batch, a loss is computed, and the optimizer updates the model's weights to minimize that loss, exactly as it would for any feedforward network.
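Before moving to the TensorFlow walkthrough, the batch-by-batch training loop described above can be illustrated with a minimal, self-contained sketch. This toy example uses plain NumPy and a simple squared-error loss rather than a real Word2vec objective; the data and model are placeholders chosen only to show the scan-batch-loss-update cycle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for a corpus: learn w for a linear model y = X @ w_true.
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(100, 2))
y = X @ w_true

w = np.zeros(2)     # parameters to learn
lr = 0.1            # learning rate
batch_size = 10

for epoch in range(50):
    # Scan over the data in batches, as Word2vec scans the corpus.
    for start in range(0, len(X), batch_size):
        xb, yb = X[start:start + batch_size], y[start:start + batch_size]
        pred = xb @ w
        loss = np.mean((pred - yb) ** 2)          # loss for this batch
        grad = 2 * xb.T @ (pred - yb) / len(xb)   # gradient of the batch loss
        w -= lr * grad                            # SGD update
```

After training, `w` converges toward `w_true`; Word2vec follows the same recipe, only with word-embedding matrices as the parameters and a word-prediction loss in place of squared error.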
Let's walk through how we would create and train a Word2vec model in TensorFlow:
- First, let's start with our imports. We'll use our standard tensorflow and numpy imports, the Python standard-library module itertools, and two utility functions from the machine learning package scikit-learn. The following code block shows these imports.
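A sketch of what those imports might look like follows. The text does not name the two scikit-learn utilities, so the ones shown here (`TSNE` for later embedding visualization and `shuffle` for randomizing batches) are illustrative assumptions, not the definitive choices.

```python
import itertools

import numpy as np
import tensorflow as tf

# Assumed scikit-learn utilities; substitute whichever two the
# walkthrough actually uses.
from sklearn.manifold import TSNE
from sklearn.utils import shuffle
```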