You can think of each piece of text and each label as a vector in space. The coordinates of those vectors are what we are actually trying to tweak and train, so that the vector for a text and the vector for its associated label end up really close together in space.
So, in this example, which is shown in 2D space, texts such as "Nigerian Tommy Thompson is also a relative newcomer to the wrestling scene" and "James scored 20 of his 46 points in the opening quarter" are closer to the "sports" label than to the "travel" label.
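To make "close in space" concrete, here is a minimal sketch of the 2D picture. The coordinates are made up for illustration (in practice they would be learned during training), and cosine similarity is an assumed choice of closeness measure; the helper names `cosine_similarity` and `nearest_label` are hypothetical.

```python
import math

# Hypothetical 2D coordinates, invented for illustration only.
# In a real model these values are what training would adjust.
label_vectors = {
    "sports": (0.9, 0.2),
    "travel": (-0.3, 0.8),
}
text_vectors = {
    "Nigerian Tommy Thompson is also a relative newcomer to the wrestling scene": (0.8, 0.3),
    "James scored 20 of his 46 points in the opening quarter": (1.0, 0.1),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (assumed closeness measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_label(text_vector, labels):
    """Return the name of the label whose vector is most similar to the text vector."""
    return max(labels, key=lambda name: cosine_similarity(text_vector, labels[name]))


for text, vector in text_vectors.items():
    print(nearest_label(vector, label_vectors), "<-", text)
```

With these made-up coordinates, both sentences land nearer to "sports" than to "travel", which is the situation training is trying to bring about.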
The way we can do this is we can take the ...