Hands-On Transfer Learning with Python

by Dipanjan Sarkar, Raghav Bali, Tamoghna Ghosh
August 2018
Intermediate to advanced
438 pages
12h 3m
English
Packt Publishing

Word2vec using gensim

The gensim framework, created by Radim Rehurek, provides a robust, efficient, and scalable implementation of the Word2vec model (https://radimrehurek.com/gensim/models/word2vec.html). It lets us choose either the skip-gram or the CBOW model. Let's try to learn and visualize word embeddings for the IMDB corpus. As discussed before, this corpus has 50,000 labeled documents and 50,000 unlabeled documents. Learning word representations requires no labels, so we can use all 100,000 available documents.
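To make the difference between the two models concrete, here is a minimal pure-Python sketch of the training examples each one is built from. The helper names `skipgram_pairs` and `cbow_pairs` and the toy sentence are illustrative assumptions, not part of gensim or of the book's code; gensim builds these examples internally.

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs: skip-gram predicts each context word from the center word."""
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


def cbow_pairs(tokens, window=2):
    """Yield (context_words, center) pairs: CBOW predicts the center word from its surrounding words."""
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:
            yield context, center


# Toy sentence (made up for illustration), with a context window of one word per side
tokens = "the movie was great".split()
sg = list(skipgram_pairs(tokens, window=1))
cbow = list(cbow_pairs(tokens, window=1))

print(sg[:3])   # one (center, context) pair per neighboring word
print(cbow[:2])  # one (context list, center) example per position
```

In gensim itself, the choice is made through the `sg` parameter of `Word2Vec`: `sg=1` selects skip-gram and `sg=0` (the default) selects CBOW.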

Let's first load the full corpus. The downloaded documents are divided into train, test, and unsup folders:

def load_imdb_data(directory = 'train', datafile = None):
    ''' Parse IMDB review ...
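The listing above is cut off in this preview, so the following is only a rough sketch of one way to gather every review into a single unlabeled corpus. It is not the book's `load_imdb_data`. The helper name `read_reviews`, the assumption that reviews are stored as `.txt` files, and the sample file names and contents are all hypothetical; the demo builds a throwaway directory so it runs without the real dataset.

```python
import tempfile
from pathlib import Path


def read_reviews(root, folders=("train", "test", "unsup")):
    """Return the text of every .txt file found under the given folders of root."""
    docs = []
    for folder in folders:
        # rglob also picks up files in nested subfolders (for example pos/ and neg/)
        for path in sorted((Path(root) / folder).rglob("*.txt")):
            docs.append(path.read_text(encoding="utf-8"))
    return docs


# Demo on a temporary directory with made-up reviews
with tempfile.TemporaryDirectory() as tmp:
    samples = [
        ("train/pos", "0_9.txt", "A great movie."),
        ("test/neg", "1_2.txt", "A dull movie."),
        ("unsup", "2_0.txt", "Some unlabeled review."),
    ]
    for folder, name, text in samples:
        target = Path(tmp) / folder
        target.mkdir(parents=True, exist_ok=True)
        (target / name).write_text(text, encoding="utf-8")
    docs = read_reviews(tmp)

# Word2vec trains on tokenized sentences; whitespace splitting is a crude stand-in
# for a real tokenizer and leaves punctuation attached to words.
sentences = [doc.lower().split() for doc in docs]
print(len(docs), sentences[0])
```

Labels are ignored here on purpose, since learning the embeddings uses only the raw text.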


Publisher Resources

ISBN: 9781788831307