Chapter 9. Moving Beyond Foundation Models

Introduction

In previous chapters, we focused on using or fine-tuning pre-trained models such as GPT, BERT, and Claude to tackle a variety of natural language processing and computer vision tasks. While these models have demonstrated state-of-the-art performance on a wide range of benchmarks, they may not suffice for more complex or domain-specific tasks that require a deeper understanding of the problem.

In this chapter, we explore the concept of constructing novel LLM architectures by combining existing models. By combining different models, we can leverage their strengths to create a hybrid architecture that either performs better than the individual models or performs a task that neither model could handle alone.
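To make the idea concrete, here is a minimal sketch (not an example from this chapter) of one simple way to combine two existing models: two frozen pre-trained encoders from the Hugging Face transformers library whose pooled outputs are concatenated and fed to a small, trainable classification head. The model names, the mean-pooling step, and the HybridClassifier class are illustrative assumptions, not the book's own code.

    # A minimal sketch of a hybrid architecture built from two
    # existing pre-trained models (illustrative assumptions).
    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class HybridClassifier(nn.Module):
        def __init__(self, name_a="bert-base-uncased",
                     name_b="distilroberta-base", num_labels=2):
            super().__init__()
            self.encoder_a = AutoModel.from_pretrained(name_a)
            self.encoder_b = AutoModel.from_pretrained(name_b)
            # Freeze both backbones; only the fusion head is trained.
            for p in self.encoder_a.parameters():
                p.requires_grad = False
            for p in self.encoder_b.parameters():
                p.requires_grad = False
            hidden = (self.encoder_a.config.hidden_size
                      + self.encoder_b.config.hidden_size)
            self.head = nn.Linear(hidden, num_labels)

        def forward(self, inputs_a, inputs_b):
            # Mean-pool each encoder's final hidden states, then
            # concatenate the two pooled representations.
            a = self.encoder_a(**inputs_a).last_hidden_state.mean(dim=1)
            b = self.encoder_b(**inputs_b).last_hidden_state.mean(dim=1)
            return self.head(torch.cat([a, b], dim=-1))

    tok_a = AutoTokenizer.from_pretrained("bert-base-uncased")
    tok_b = AutoTokenizer.from_pretrained("distilroberta-base")
    model = HybridClassifier()
    text = ["Combining models can beat either one alone."]
    logits = model(tok_a(text, return_tensors="pt", padding=True),
                   tok_b(text, return_tensors="pt", padding=True))
    print(logits.shape)  # torch.Size([1, 2])

Freezing the backbones keeps training cheap: only the small linear head learns how to fuse the two representations, while each encoder contributes the strengths it acquired during pre-training.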
