Building word vectors using fastText

fastText is a library created by the Facebook Research Team for the efficient learning of word representations and sentence classification.

fastText differs from word2vec in that word2vec treats each whole word as the smallest unit for which a vector representation is learned, whereas fastText assumes a word is formed from character n-grams. For example, sunny is composed of sun, sunn, sunny, unn, unny, nny, and so on: each is a contiguous substring of the original word of size n, where n could range from 1 to the length of the original word.
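
To make the character n-gram idea concrete, here is a minimal sketch in plain Python that lists every contiguous substring of a word for a range of sizes n. The function name char_ngrams and its min_n and max_n parameters are hypothetical names chosen for this illustration; they are not part of the fastText library.

```python
def char_ngrams(word, min_n=1, max_n=None):
    """Return all contiguous character n-grams of `word`.

    Illustrative helper only (hypothetical name, not a fastText API).
    n runs from min_n up to max_n; max_n defaults to len(word).
    """
    if max_n is None:
        max_n = len(word)
    ngrams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the word.
        for start in range(len(word) - n + 1):
            ngrams.append(word[start:start + n])
    return ngrams


print(char_ngrams("sunny", min_n=3))
# ['sun', 'unn', 'nny', 'sunn', 'unny', 'sunny']
```

The fastText library itself also wraps each word in the boundary markers < and > before extracting n-grams, and by default restricts n to the range 3 to 6; the sketch leaves out both details to stay close to the description above.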

Another reason to use fastText is that some words do not meet the minimum frequency cut-off in the skip-gram or CBOW models. For example, the ...
