What is word embedding?
Word embedding is a feature-engineering technique used in natural language processing (NLP). In short, each word in the vocabulary is mapped to a fixed-length vector of real numbers. Traditionally, a word is treated as a discrete, atomic symbol in NLP. Such an encoding is arbitrary and carries no information about the relationships between words. It also leads to sparse data, which typically requires more training data to fit a statistical model successfully. Word embedding addresses this by using vector space models to map the discrete symbols into a continuous vector space while preserving their semantics. For example, a five-dimensional embedding of an English vocabulary can be seen in the following ...
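The contrast above can be sketched in a few lines of NumPy. This is a minimal, illustrative example: the vocabulary is a toy one, and the embedding table is filled with random values standing in for vectors that would normally be learned (e.g., by word2vec or GloVe). It shows why one-hot vectors carry no similarity information, while a dense embedding packs every word into a small continuous space.

```python
import numpy as np

# Toy vocabulary: each word is assigned an integer id (illustrative only).
vocab = {"king": 0, "queen": 1, "apple": 2, "orange": 3}

def one_hot(word, vocab):
    """Discrete, sparse encoding: one 1.0 in a vector as long as the vocabulary."""
    v = np.zeros(len(vocab))
    v[vocab[word]] = 1.0
    return v

# A hypothetical 5-dimensional embedding table. In practice these rows
# are trained parameters; random values are used here as a stand-in.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 5))

def embed(word, vocab, table):
    """Dense encoding: look up the word's row in the embedding table."""
    return table[vocab[word]]

# One-hot vectors of any two distinct words are orthogonal, so their dot
# product is 0 -- the encoding expresses no relationship between words.
print(one_hot("king", vocab) @ one_hot("queen", vocab))  # 0.0

# The embedding is a fixed-length, continuous 5-dimensional vector.
print(embed("king", vocab, embedding_table).shape)  # (5,)
```

Note that the one-hot vector grows with the vocabulary size, whereas the embedding dimension (here 5) is fixed regardless of how many words are added.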