In late 2018, researchers from Google published a paper introducing a deep learning technique that would soon become a major breakthrough: Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al. 2018). Like Word2Vec, BERT derives word embeddings from raw textual data, but it does so in a much more clever and powerful manner: it takes both the left and the right context into account when learning the vector representation for a word (figure 9.1). Word2Vec, in contrast, ends up with a single, context-independent vector for each word, no matter which sentence the word appears in. But this is not the only difference. BERT is grounded in ...
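To make the contrast concrete, here is a minimal sketch, not taken from the chapter, that probes BERT's contextual embeddings with the Hugging Face transformers and torch packages and the pretrained bert-base-uncased model; the helper embedding_of and the example sentences are hypothetical illustrations:

# A minimal sketch (an assumption, not the book's code): the same word
# receives different BERT vectors in different sentences, because BERT
# conditions each vector on the words to its left *and* right.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embedding_of(word, sentence):
    # Hypothetical helper: BERT's contextual vector for `word` in `sentence`.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]   # shape: (seq_len, 768)
    word_id = tokenizer.convert_tokens_to_ids(word)
    position = (inputs["input_ids"][0] == word_id).nonzero()[0].item()
    return hidden[position]

river_bank = embedding_of("bank", "We sat on the bank of the river.")
money_bank = embedding_of("bank", "I deposited the money at the bank.")

similarity = torch.nn.functional.cosine_similarity(river_bank, money_bank, dim=0)
print(f"cosine similarity between the two 'bank' vectors: {similarity.item():.2f}")

Word2Vec would assign "bank" one fixed vector shared by both sentences; the two BERT vectors above typically differ noticeably, because each reflects the surrounding words on both sides of "bank".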