Let's say we have a neural network with an input layer, a hidden layer, and an output layer. The goal of the network is to predict a word given its surrounding words. The word that we are trying to predict is called the target word and the words surrounding the target word are called the context words.
How many context words do we use to predict the target word? We use a window of a chosen size to select the context words. If the window size is 2, then we use the two words before and the two words after the target word as the context words.
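To make the window idea concrete, here is a minimal sketch in Python. The function name context_words, its parameters, and the whitespace tokenization are assumptions made for illustration, not part of any particular library. Near the start or end of a sentence, fewer words are available on one side, so this sketch simply returns the ones that exist.

```python
def context_words(tokens, target_index, window_size):
    """Return the words within window_size positions of the target word.

    The target word itself is excluded. Hypothetical helper for illustration.
    """
    # Clamp the left edge so a target near the start does not wrap around.
    start = max(0, target_index - window_size)
    before = tokens[start:target_index]
    # Slicing past the end of a list is safe in Python; it just stops early.
    after = tokens[target_index + 1:target_index + 1 + window_size]
    return before + after


# Simple whitespace tokenization (an assumption; real pipelines vary).
tokens = "the sun rises in the east".split()
target_index = tokens.index("rises")

print(tokens[target_index], context_words(tokens, target_index, 2))
# prints: rises ['the', 'sun', 'in', 'the']
```

With a window size of 2 and the target word in the third position, the sketch picks up two words on each side. Moving the target to the first position would yield only the two words that follow it.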
Let's consider the sentence "The Sun rises in the east", with the word "rises" as the target ...