12 Working with Efficient Transformers

So far, you have learned how to design a Natural Language Processing (NLP) architecture that performs well on a task using transformers. In this chapter, you will first learn how to derive efficient models from trained models using distillation, pruning, and quantization. Second, you will learn about efficient transformers with sparse or approximate attention, such as Linformer, BigBird, and Performer. You will see how they perform on benchmarks such as memory versus sequence length and speed versus sequence length. You will also see model size reduction applied in practice.

This chapter matters because it is becoming increasingly difficult to run large neural models under limited computational capacity. ...


Mastering Transformers - Second Edition by Savaş Yıldırım, Meysam Asgari-Chenaghlu
