Mastering Transformers

Name: Mastering Transformers
ISBN: 9781801077651

by Savaş Yıldırım, Meysam Asgari-Chenaghlu

September 2021

Beginner to intermediate

374 pages

7h 35m

English

Packt Publishing

Mastering Transformers
    Contributors
    About the authors
    About the reviewer
Preface
    Who this book is for
    What this book covers
    To get the most out of this book
    Download the example code files
    Code in Action
    Download the color images
    Conventions used
    Get in touch
    Share Your Thoughts
Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications
Chapter 1: From Bag-of-Words to the Transformer
    Technical requirements
    Evolution of NLP toward Transformers
    Understanding distributional semantics
    BoW implementation
    Overcoming the dimensionality problem
    Language modeling and generation
    Leveraging DL
    Learning word embeddings
    A brief overview of RNNs
    LSTMs and gated recurrent units
    A brief overview of CNNs
    Overview of the Transformer architecture
    Attention mechanism
    Multi-head attention mechanisms
    Using TL with Transformers
    Summary
    References
Chapter 2: A Hands-On Introduction to the Subject
    Technical requirements
    Installing Transformer with Anaconda
    Installation on Linux
    Installation on Windows
    Installation on macOS
    Installing TensorFlow, PyTorch, and Transformer
    Installing using Google Colab
    Working with language models and tokenizers
    Working with community-provided models
    Working with benchmarks and datasets
    Important benchmarks
    Accessing the datasets with an Application Programming Interface
    Benchmarking for speed and memory
    Summary
Section 2: Transformer Models – From Autoencoding to Autoregressive Models
Chapter 3: Autoencoding Language Models
    Technical requirements
    BERT – one of the autoencoding language models
    BERT language model pretraining tasks
    A deeper look into the BERT language model
    Autoencoding language model training for any language
    Sharing models with the community
    Understanding other autoencoding models
    Introducing ALBERT
    RoBERTa
    ELECTRA
    Working with tokenization algorithms
    Byte pair encoding
    WordPiece tokenization
    Sentence piece tokenization
    The tokenizers library
    Summary
Chapter 4: Autoregressive and Other Language Models
    Technical requirements
    Working with AR language models
    Introduction and training models with GPT
    Transformer-XL
    XLNet
    Working with Seq2Seq models
    T5
    Introducing BART
    AR language model training
    NLG using AR models
    Summarization and MT fine-tuning using simpletransformers
    Summary
    References
Chapter 5: Fine-Tuning Language Models for Text Classification
    Technical requirements
    Introduction to text classification
    Fine-tuning a BERT model for single-sentence binary classification
    Training a classification model with native PyTorch
    Fine-tuning BERT for multi-class classification with custom datasets
    Fine-tuning the BERT model for sentence-pair regression
    Utilizing run_glue.py to fine-tune the models
    Summary
Chapter 6: Fine-Tuning Language Models for Token Classification
    Technical requirements
    Introduction to token classification
    Understanding NER
    Understanding POS tagging
    Understanding QA
    Fine-tuning language models for NER
    Question answering using token classification
    Summary

Chapter 7: Text Representation
    Technical requirements
    Introduction to sentence embeddings
    Cross-encoder versus bi-encoder
    Benchmarking sentence similarity models
    Using BART for zero-shot learning
    Semantic similarity experiment with FLAIR
    Average word embeddings
    RNN-based document embeddings
    Transformer-based BERT embeddings
    Sentence-BERT embeddings
    Text clustering with Sentence-BERT
    Topic modeling with BERTopic
    Semantic search with Sentence-BERT
    Summary
    Further reading
Section 3: Advanced Topics
Chapter 8: Working with Efficient Transformers
    Technical requirements
    Introduction to efficient, light, and fast transformers
    Implementation for model size reduction
    Working with DistilBERT for knowledge distillation
    Pruning transformers
    Quantization
    Working with efficient self-attention
    Sparse attention with fixed patterns
    Learnable patterns
    Low-rank factorization, kernel methods, and other approaches
    Summary
    References
Chapter 9: Cross-Lingual and Multilingual Language Modeling
    Technical requirements
    Translation language modeling and cross-lingual knowledge sharing
    XLM and mBERT
    mBERT
    XLM
    Cross-lingual similarity tasks
    Cross-lingual text similarity
    Visualizing cross-lingual textual similarity
    Cross-lingual classification
    Cross-lingual zero-shot learning
    Fundamental limitations of multilingual models
    Fine-tuning the performance of multilingual models
    Summary
    References
Chapter 10: Serving Transformer Models
    Technical requirements
    fastAPI Transformer model serving
    Dockerizing APIs
    Faster Transformer model serving using TFX
    Load testing using Locust
    Summary
    References
Chapter 11: Attention Visualization and Experiment Tracking
    Technical requirements
    Interpreting attention heads
    Visualizing attention heads with exBERT
    Multiscale visualization of attention heads with BertViz
    Understanding the inner parts of BERT with probing classifiers
    Tracking model metrics
    Tracking model training with TensorBoard
    Tracking model training live with W&B
    Summary
    References
Why subscribe?
Other Books You May Enjoy
    Leave a review - let other readers know what you think
    Share Your Thoughts

Content preview from Mastering Transformers

Chapter 8: Working with Efficient Transformers

So far, you have learned how to design a Natural Language Processing (NLP) architecture that achieves strong task performance with transformers. In this chapter, you will first learn how to derive efficient models from trained ones using distillation, pruning, and quantization. You will then learn about efficient transformer variants such as Linformer, BigBird, and Performer, and see how they perform on benchmarks such as memory versus sequence length and speed versus sequence length. You will also see the practical use of model size reduction.
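
To make two of the size-reduction ideas named above more concrete, here is a minimal sketch of magnitude pruning and symmetric int8 quantization on a plain list of weights. It is an illustration under stated assumptions, not the book's code: the helper names (`magnitude_prune`, `quantize_int8`, `dequantize`) and the sample weights are hypothetical, and real workflows would normally rely on a deep learning framework's own pruning and quantization utilities.

```python
# Illustrative sketch only; helper names and sample values are hypothetical.

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to approximate float weights."""
    return [v * scale for v in q]


weights = [0.02, -1.27, 0.5, -0.01, 0.9, 0.0]
pruned = magnitude_prune(weights, sparsity=0.5)   # half of the weights become 0.0
q, scale = quantize_int8(pruned)                  # small integers plus one float scale
restored = dequantize(q, scale)                   # approximation of `pruned`

print(pruned)
print(q, scale)
print(restored)
```

The pruned weights can be stored sparsely, and the quantized weights need one byte each plus a single shared scale, which is where the size reduction comes from; the price is the small rounding error visible in `restored`.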

This chapter matters because it is becoming difficult to run large neural models under limited computational ...
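
The memory-versus-sequence-length comparison mentioned in the preview can be illustrated with simple arithmetic. The sketch below counts the entries in one attention head's score matrix for standard full self-attention (sequence length by sequence length) and for a Linformer-style variant that projects keys and values down to a fixed length `k`. The function names and the value `k=256` are assumptions chosen for illustration; they are not taken from the book, and the counts ignore every other source of memory use.

```python
# Illustrative arithmetic only; function names and k=256 are assumptions.

def full_attention_entries(seq_len):
    """Score-matrix entries for full self-attention: seq_len x seq_len."""
    return seq_len * seq_len


def projected_attention_entries(seq_len, k):
    """Score-matrix entries when keys/values are projected to length k: seq_len x k."""
    return seq_len * k


for n in (512, 1024, 2048, 4096):
    print(n, full_attention_entries(n), projected_attention_entries(n, k=256))
```

Doubling the sequence length quadruples the full-attention count but only doubles the projected one, which is the kind of gap such benchmarks are designed to expose.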
