Quick Start Guide to Large Language Models: Strategies and Best Practices for ChatGPT, Embeddings, Fine-Tuning, and Multimodal AI, 2nd Edition

by Sinan Ozdemir
October 2024
Content level: Intermediate to advanced
384 pages
13h 7m
English
Addison-Wesley Professional
12

Evaluating LLMs

Introduction

Admittedly, we’ve spent the vast majority of this book building, thinking about, and iterating on our LLM systems, and far less time establishing rigorous, structured tests for them. That said, we have seen evaluation at work in bits and pieces throughout the book: we evaluated our fine-tuned recommendation engine by judging the recommendations it produced, we tested our classifiers against metrics like accuracy and precision, and we validated our chat-aligned SAWYER and T5 models against our reward mechanisms and even on some benchmarks.

This chapter brings all of these evaluation techniques together and adds to the list. That’s because, at the end of the day, no matter how well ...

Publisher Resources

ISBN: 9780135346570