First, we will understand how forward propagation works in the skip-gram model. Let's use the same notation we used in the CBOW model. The architecture of the skip-gram model is shown in the following figure. As you can see, we feed only one target word as the input, and the model returns the context words as the output:
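Before walking through the math, the forward pass described above can be sketched in NumPy. This is a minimal, illustrative implementation, not the book's exact code: the names `W` (input-to-hidden weights) and `W_prime` (hidden-to-output weights) and the toy dimensions are assumptions standing in for the CBOW notation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumed for illustration): a 10-word vocabulary,
# 4-dimensional embeddings.
vocab_size, embed_dim = 10, 4
W = rng.standard_normal((vocab_size, embed_dim))        # input -> hidden
W_prime = rng.standard_normal((embed_dim, vocab_size))  # hidden -> output

def softmax(u):
    # Numerically stable softmax over the score vector.
    e = np.exp(u - u.max())
    return e / e.sum()

def forward(target_index):
    # Multiplying a one-hot input vector by W just selects one row of W:
    # the target word's embedding becomes the hidden layer h.
    h = W[target_index]
    # Score every vocabulary word, then normalize to probabilities.
    u = h @ W_prime
    y_hat = softmax(u)
    return h, u, y_hat

h, u, y_hat = forward(3)
print(y_hat.shape)  # one probability per vocabulary word
```

In the full skip-gram model, this same probability vector `y_hat` is compared against each of the surrounding context words; only the loss changes, not the forward pass.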
Similar to what we saw for CBOW in the Forward propagation section, first we multiply our input ...