Deep Learning with PyTorch

by Vishnu Subramanian

February 2018

Intermediate to advanced

262 pages

6h 59m

English

Packt Publishing

Content preview from Deep Learning with PyTorch

One-hot encoding

In one-hot encoding, each token is represented by a vector of length N, where N is the size of the vocabulary. The vocabulary is the set of unique words in the document. Each vector contains a 1 at the index assigned to its token and a 0 everywhere else. Let's take a simple sentence and observe how each token would be represented as a one-hot encoded vector. The following is the sentence:

An apple a day keeps doctor away said the doctor.

The one-hot encoding for the preceding sentence can be represented in a tabular format as follows:
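A minimal Python sketch of this encoding can make the idea concrete. It is not taken from the book: the helper names `tokenize`, `build_vocab`, and `one_hot` are hypothetical, and the tokenization rules (split on whitespace, strip the trailing period, treat tokens as case-sensitive) are assumptions, since the text does not specify them.

```python
# Sentence taken from the text; the tokenization rules are assumptions.
sentence = "An apple a day keeps doctor away said the doctor."


def tokenize(text):
    """Split on whitespace and strip periods (assumed rule)."""
    return [word.strip(".") for word in text.split()]


def build_vocab(tokens):
    """Map each unique token to an index, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def one_hot(token, vocab):
    """Return a list of length N with a single 1 at the token's index."""
    vector = [0] * len(vocab)
    vector[vocab[token]] = 1
    return vector


tokens = tokenize(sentence)
vocab = build_vocab(tokens)

# Print one row per unique token: the token, then its one-hot vector.
for token in vocab:
    print(f"{token:<8}", "".join(str(bit) for bit in one_hot(token, vocab)))
```

Under these assumptions the vocabulary has nine entries, because "An" and "a" are kept as distinct tokens and "doctor" appears twice but is counted once. Each vector therefore has length nine, and its size grows with the vocabulary.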