Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Word2Vector

The Word2Vec tool takes text data as input and produces word vectors as output. It constructs a vocabulary from the training text and learns a vector representation for each word. The resulting word vector file can be used as a source of features in many natural language processing and machine learning applications.
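To make the vocabulary-construction step concrete, here is a minimal sketch in plain Python. It is not the Word2Vec tool's or Spark's actual code: the `build_vocabulary` function, the whitespace tokenization, the `min_count` parameter, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter


def build_vocabulary(sentences, min_count=1):
    """Map each sufficiently frequent token to an integer index.

    Hypothetical helper: tokens are lowercased and split on whitespace,
    which is a simplification of real tokenization.
    """
    counts = Counter(
        token for sentence in sentences for token in sentence.lower().split()
    )
    # Drop rare tokens, then order by descending frequency (ties alphabetical).
    kept = [(word, count) for word, count in counts.items() if count >= min_count]
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return {word: index for index, (word, _) in enumerate(kept)}


# Toy corpus, invented for this example.
corpus = [
    "spark makes machine learning scalable",
    "machine learning with spark",
    "word vectors capture meaning",
]

vocab = build_vocabulary(corpus, min_count=2)
print(vocab)
```

Only the tokens that occur at least `min_count` times receive an index; a trained model would then learn one vector per retained token.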

The easiest way to investigate the learned representations is to find the closest words for a user-specified word.
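A closest-word lookup of this kind is commonly done by ranking words by cosine similarity to the query word's vector. The sketch below assumes that approach; the helper names (`cosine_similarity`, `closest_words`) and the three-dimensional toy vectors are invented for illustration and are not the output of a trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_words(query, vectors, top_n=2):
    """Return the top_n (word, similarity) pairs most similar to `query`."""
    query_vec = vectors[query]
    scored = [
        (word, cosine_similarity(query_vec, vec))
        for word, vec in vectors.items()
        if word != query  # the query word itself is not a useful neighbour
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hand-written toy vectors (assumption): real word vectors typically
# have far more dimensions and come from training.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "apple": [0.1, 0.2, 0.95],
    "pear": [0.15, 0.1, 0.9],
}

neighbours = closest_words("king", toy_vectors, top_n=2)
for word, score in neighbours:
    print(word, round(score, 3))
```

With these toy vectors, "queen" ranks first for the query "king". In Spark, a trained `Word2VecModel` offers a `findSynonyms` method for this type of query.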

The Word2Vec implementation in Apache Spark computes distributed vector representations of words. It is a more scalable approach than the single-machine Word2Vec implementation provided by Google (https://code.google.com/archive/p/word2vec/).

Word2Vec can be implemented using ...
