Understanding the bag of words model
In this book, we represent a document with a popular model from machine learning called the bag of words. To get a feel for the idea, imagine taking all the words in a document, throwing them in a bag, shaking them well, and taking them out in no particular order. The resulting scrambled document might mean little to a human being, but under this model it has the same value to a machine as the original document. That's why we implemented all of those functions in our service so far.
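To make the "shake the bag" picture concrete, here is a minimal sketch in Python. It is an illustration only, not the code of the service built in this book: the helper name `bag_of_words` and the sample sentence are hypothetical, and the tokenization (lowercasing and splitting on whitespace) is a deliberate simplification.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Return a word -> count mapping that ignores word order.

    Hypothetical helper for illustration; tokenization is naive
    (lowercase, then split on whitespace).
    """
    return Counter(text.lower().split())


original = "the cat sat on the mat"

# "Shake the bag": put the same words back in a random order.
words = original.split()
random.shuffle(words)
shuffled = " ".join(words)

# The two documents read differently but produce identical bags,
# because only the words and their counts are kept.
same_bag = bag_of_words(original) == bag_of_words(shuffled)
print(shuffled)
print(same_bag)
```

However the words happen to be shuffled, `same_bag` is `True`: the representation keeps each word and its count and nothing about position.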
Basically, discarding the grammatical structure frees us to focus on the individual word instances, their weights, and how often they are repeated in the document. We will find out why and how we can benefit ...