
Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam


Encoding the text

Next, a function is created to convert the lists of strings into vectors. The character set is built first, by concatenating the lowercase English letters with punctuation and symbol characters. A dictionary is then created with those characters as keys and integers as values:

def get_encoded_x(train_x1, train_x2, test_x1, test_x2):
    chars = string.ascii_lowercase + '? ()=+-_~"`<>,./\\|[]{}!@#$%^&*:;' + "'"

The preceding character set covers only lowercase English letters plus common punctuation and symbols. Characters from other languages can also be included, to make the approach work across languages.

Note that the character set can instead be inferred from the dataset, which automatically includes any non-English characters that occur in it.
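As a rough illustration of inferring the character set from the data, the following sketch collects every distinct character found in one or more lists of strings. The helper name infer_charset and the sample sentences are hypothetical and not from the book. Lowercasing the text is an assumption made here to match the lowercase-only set shown earlier.

```python
def infer_charset(*text_lists):
    """Collect every distinct character seen in the given lists of strings."""
    seen = set()
    for texts in text_lists:
        for text in texts:
            # Assumption: text is lowercased before encoding.
            seen.update(text.lower())
    # Sort so the character order (and any index built from it) is reproducible.
    return ''.join(sorted(seen))


# Hypothetical sample data, used only for illustration.
sample_x1 = ["¿Qué es NLP?", "what is nlp"]
sample_x2 = ["Was ist NLP?"]
inferred_chars = infer_charset(sample_x1, sample_x2)
```

Building the set from the training data alone means that characters appearing only in the test data will be unknown at encoding time, so they need a fallback value.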

Next, a character map is formed between the set of characters and integers. The ...
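A minimal sketch of such a character map, and of using it to encode one string, might look like the following. The names char_map and encode_text, the max_len parameter, and the choice to reserve 0 for padding and unknown characters are assumptions made for illustration. The book's own implementation may differ.

```python
import string

# Character set as in the text above (lowercase letters, punctuation, space, quote).
chars = string.ascii_lowercase + '? ()=+-_~"`<>,./\\|[]{}!@#$%^&*:;' + "'"

# Map each character to an integer. Starting at 1 is an assumption made here
# so that 0 can stand for padding and unknown characters.
char_map = {ch: idx for idx, ch in enumerate(chars, start=1)}


def encode_text(text, max_len=None):
    """Convert a string into a list of integer ids (hypothetical helper)."""
    ids = [char_map.get(ch, 0) for ch in text.lower()]
    if max_len is not None:
        # Truncate long inputs and right-pad short ones with 0.
        ids = ids[:max_len] + [0] * (max_len - len(ids))
    return ids


encoded = encode_text("Is it ok?", max_len=12)
```

Fixing the output length with max_len lets the encoded strings be stacked into a single array, which is the usual input shape for a neural network.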
