Now that we understand how the CBOW model works with a single context word, let's see how it works with multiple context words. The architecture of CBOW with multiple input words as context is shown in the following figure:
There is little difference between using multiple context words and using a single context word. The only change is that, with multiple context words as inputs, we take the average over all the input context words. That is, as a first step, we forward propagate the network and compute the value of the hidden layer by multiplying the input and the weights, as we saw in the CBOW ...
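The averaging step can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the function name `cbow_hidden`, the weight matrix shape (vocabulary size by embedding dimension), the toy sizes, and the random weights are all assumptions made for illustration. It computes the hidden layer as the mean of the context words' projections and checks that selecting rows of the weight matrix gives the same result as multiplying one-hot inputs by the weights.

```python
import numpy as np


def cbow_hidden(context_ids, W):
    """Return the hidden layer for a set of context words.

    Assumes W has shape (vocab_size, embedding_dim), so row i of W
    is the projection of the one-hot vector for word i.
    """
    # Pick the rows for the context words, then average them.
    return W[context_ids].mean(axis=0)


# Toy setup (hypothetical sizes): 5-word vocabulary, 3-dimensional hidden layer.
rng = np.random.default_rng(0)
vocab_size, embedding_dim = 5, 3
W = rng.normal(size=(vocab_size, embedding_dim))

context_ids = [0, 2, 4]            # indices of the context words
h = cbow_hidden(context_ids, W)    # shape: (embedding_dim,)

# Same computation with explicit one-hot inputs:
# multiply each one-hot vector by W, then average the results.
X = np.eye(vocab_size)[context_ids]    # shape: (num_context, vocab_size)
h_onehot = (X @ W).mean(axis=0)
```

With a single context word the mean is taken over one row, so the same function reduces to the single-word case.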