Effective Amazon Machine Learning by Alexis Perrier

Text mining

The N-gram and the orthogonal sparse bigram (OSB) transformations are the main text-mining transformations available in Amazon ML.

In text mining, the classic approach is the bag-of-words approach. It boils down to discarding the order of the words in a given text and considering only the relative frequency of the words in the documents. Although it may seem overly simplistic, since word order is essential to understanding a message, this approach has given satisfying results in a wide range of natural language processing problems. A key part of the bag-of-words method is driven by the need to extract the words from a given text. However, instead of considering single words as the only elements holding ...
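To make the bag-of-words idea concrete, here is a minimal sketch in plain Python. It is not Amazon ML's implementation: the function names `tokenize`, `bag_of_words`, and `ngrams` are hypothetical, and the tokenization rule (lowercase, split on whitespace, strip common punctuation) is a simplifying assumption. It shows relative word frequencies that ignore word order, plus word N-grams as one way of going beyond single words.

```python
from collections import Counter

# Punctuation stripped from token edges (an assumption made for this sketch).
PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase the text, split on whitespace, and strip edge punctuation."""
    tokens = []
    for raw in text.split():
        word = raw.strip(PUNCTUATION).lower()
        if word:
            tokens.append(word)
    return tokens


def bag_of_words(text):
    """Map each word to its relative frequency; word order is discarded."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    sentence = "The cat saw the dog"
    print(bag_of_words(sentence))
    print(ngrams(tokenize(sentence), 2))
```

Two texts made of the same words in a different order produce the same bag of words, which is exactly the information the approach throws away. The bigrams keep a little of that local ordering.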
