January 2018
Beginner to intermediate
284 pages
8h 35m
English
Extracting useful information from text-based data is no easy task. For a basic application such as document classification, a common feature-extraction method is bag of words (BoW), in which the number of times each word occurs is used as a feature for training the classifier. We will briefly talk about BoW in the following section, as well as the tf-idf approach, which is intended to reflect how important a word is to a document in a collection or corpus.
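To make the two ideas concrete, here is a minimal sketch in plain Python. It is not the book's own code: the function names (`tokenize`, `bag_of_words`, `tf_idf`), the toy documents, and the particular tf-idf variant (term frequency normalized by document length, multiplied by the unsmoothed `log(N / df)`) are all assumptions chosen for illustration. Libraries often use slightly different smoothing and normalization.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and split on whitespace only; a real pipeline
    # would also handle punctuation, stop words, and so on.
    return text.lower().split()

def bag_of_words(docs):
    # One Counter per document, mapping each word to its raw count.
    return [Counter(tokenize(d)) for d in docs]

def tf_idf(docs):
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(term for c in counts for term in c)
    weights = []
    for c in counts:
        total = sum(c.values())
        # tf = count / document length, idf = log(N / df) (unsmoothed variant).
        weights.append({t: (n / total) * math.log(n_docs / df[t])
                        for t, n in c.items()})
    return counts, weights

# Hypothetical toy corpus.
docs = ["the cat sat on the mat", "the dog sat on the log", "cats and dogs"]
counts, weights = tf_idf(docs)
print(counts[0])
print(weights[0])
```

In the first document, "the" has the highest raw count, but because it also appears in the second document its tf-idf weight ends up lower than that of "cat", which appears in only one document. This is the sense in which tf-idf is meant to reflect a word's importance to a document within a corpus.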