Recent advances in machine learning have lowered the barriers to creating and using ML models. But understanding what these models are doing has only become more difficult. We often discuss these technological advances with little understanding of how they work, and we struggle to develop a comfortable intuition for new functionality.
In this report, authors Austin Eovito and Marina Danilevsky from IBM focus on how to think about neural network-based language model architectures. They guide you through various models (neural networks, RNN/LSTM, encoder-decoder, attention/transformers) to convey a sense of their abilities without getting entangled in the complex details. The report uses simple examples of how humans approach language in specific applications to explore and compare how different neural network-based language models work.
This report will empower you to better understand how machines understand language.
- Dive deep into the basic task of a language model, predicting the next word, and use it as a lens to understand neural network language models
- Explore encoder-decoder architecture through abstractive text summarization
- Use machine translation to understand the attention mechanism and transformer architecture
- Examine the current state of machine language understanding to discern what these language models are good at, along with their risks and weaknesses
Table of contents
- 1. Introduction: What Is It like to Be a Language Model?
- 2. Meet the Neural Model Family
  - What Do Humans Want to Remember?
  - Neural Networks for Language Modeling
  - Considerations on the Use of RNNs
  - Key Takeaways
- 3. Two Heads Are Better than One: Encoder-Decoder Architecture
- 4. Choosing What to Care About: Attention and Transformers
- 5. Machine Language Understanding
- About the Authors
- Title: Language Models in Plain English
- Release date: October 2021
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098109066