Deep Learning for Natural Language Processing

by Stephan Raaijmakers
November 2022
Content level: Beginner to intermediate
296 pages
8h 27m
English
Manning Publications
Content preview from Deep Learning for Natural Language Processing

9 Transformers

This chapter covers

  • Understanding the inner workings of Transformers
  • Deriving word embeddings with BERT
  • Comparing BERT and Word2Vec
  • Working with XLNet

In late 2018, researchers from Google published a paper introducing a deep learning technique that would soon become a major breakthrough: Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al. 2018). Like Word2Vec, BERT derives word embeddings from raw textual data, but it does so in a far more powerful manner: it takes both the left and the right context of a word into account when learning its vector representation (figure 9.1), whereas Word2Vec draws on only a single window of context. But this is not the only difference. BERT is grounded in ...
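To make the contrast concrete, here is a minimal sketch of what context dependence means in practice. It assumes the Hugging Face transformers library and PyTorch (an illustration of the idea, not the book's own code): the same surface word receives a different BERT vector in each sentence it appears in.

```python
import torch
from transformers import BertTokenizer, BertModel

# Load a pretrained BERT model and its tokenizer (downloads on first use).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embedding_for(sentence: str, word: str) -> torch.Tensor:
    """Return BERT's contextual vector for `word` inside `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # Final-layer hidden states: one 768-dim vector per token.
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

# The same surface form gets different vectors in different contexts.
v1 = embedding_for("the bank raised interest rates", "bank")
v2 = embedding_for("we sat on the river bank", "bank")
sim = torch.cosine_similarity(v1, v2, dim=0)
print(f"cosine similarity across contexts: {sim.item():.3f}")  # well below 1.0
```

A Word2Vec lookup, by comparison, would return the identical vector for "bank" in both sentences, because its embeddings are fixed once training ends.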

Publisher Resources

ISBN: 9781617295447