3
Representing Text – Capturing Semantics
Representing the meaning of words, phrases, and sentences in a form that’s understandable to computers is one of the pillars of NLP processing. Machine learning, for example, represents each data point as a list of numbers (a fixed-size vector), and we are faced with the question of how to turn words and sentences into these vectors. Most NLP tasks start by representing the text in some numeric form, and in this chapter, we show several ways to do that.
First, we will create a simple classifier to demonstrate the effectiveness of each method of encoding, and then we will use it to test the different encoding methods. We will also learn how to turn phrases such as fried chicken into vectors – that is, ...
Get Python Natural Language Processing Cookbook - Second Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.