8 Deep transfer learning for NLP with BERT and multilingual BERT

This chapter covers

  • Using the pretrained Bidirectional Encoder Representations from Transformers (BERT) architecture to perform some interesting tasks
  • Using the BERT architecture for cross-lingual transfer learning

In this chapter and the previous chapter, our goal is to cover some representative deep transfer learning modeling architectures for natural language processing (NLP) that rely on a recently popularized neural architecture, the transformer,1 for key functions. This is arguably the most important architecture for NLP today. Specifically, our goal is to look at modeling frameworks such as the generative pretrained transformer (GPT),2 Bidirectional Encoder Representations from Transformers (BERT), and multilingual BERT (mBERT).
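To make the starting point concrete, the following is a minimal sketch of loading a pretrained BERT and a multilingual BERT checkpoint with the Hugging Face transformers library. This is an illustrative assumption rather than the chapter's own listing; the checkpoint names "bert-base-uncased" and "bert-base-multilingual-cased" are publicly available pretrained models, and the chapter's code may use a different toolkit.

```python
# Illustrative sketch (not the book's own listing): load pretrained BERT and
# multilingual BERT (mBERT) checkpoints via the Hugging Face transformers library.
from transformers import BertTokenizer, BertModel

# English-only BERT
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Multilingual BERT (mBERT), pretrained on Wikipedia text in ~100 languages
mbert_tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
mbert_model = BertModel.from_pretrained("bert-base-multilingual-cased")

# Encode a sentence and inspect the contextual embeddings produced by BERT
inputs = tokenizer("Transfer learning with BERT is powerful.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch size, sequence length, hidden size)
```

The same pattern applies to mBERT: tokenize with `mbert_tokenizer` and encode with `mbert_model`, which is the starting point for the cross-lingual transfer experiments discussed later in the chapter.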
