Our approach with NER-tagging is going to mirror our approach to POS-tagging; after all, they are very similar tasks, and both of them can be compared to the machine learning task of classification, where we assign an unknown object to the class it has the highest probability of belonging to.
Another similarity in our approaches to this task is the fact that we will be using spaCy to conduct our NER-tagging. Again, this does not mean that spaCy is the only way to perform NER-tagging; there are two popular alternatives, one is NLTK, and the other is the Stanford NER-tagger.
Before we start with our explanations, it is worth our while to briefly understand the term, chunking. It is the process of breaking up your sentence ...