Generative AI in Action

by Amit Bahree
November 2024
Intermediate to advanced
464 pages
14h 38m
English
Manning Publications
Content preview from Generative AI in Action

12 Evaluations and benchmarks

This chapter covers

  • Understanding the significance of benchmarking and evaluating LLMs
  • Learning different evaluation metrics
  • Benchmarking model performance
  • Implementing comprehensive evaluation strategies
  • Best practices for evaluation benchmarks and key evaluation criteria to consider

Given the recent surge of interest in GenAI, and in large language models (LLMs) in particular, it’s crucial to approach these novel and uncertain capabilities cautiously and responsibly. Many leaderboards and studies have shown that LLMs can match human performance in various tasks, such as taking standardized tests or creating art, sparking enthusiasm and attention. However, their novelty and uncertainties necessitate ...


ISBN: 9781633436947