Chapter 12: Evaluating LLMs
For any machine learning problem, the final result boils down to the metrics: if everything else goes well but the metrics end up poor, the whole project can be shelved. Evaluation metrics are therefore among the most important aspects of any ML-based project. With LLMs this becomes even more crucial, because LLMs sometimes hallucinate and you cannot easily tell whether an answer is right or wrong. Yet evaluating LLMs isn't as straightforward as it seems. Assume the ground truth for some problem statement is 11; the LLM might return any of the following answers:
- Eleven
- The answer is Eleven
- …………..Eleven………
- The answer is 11
- 11 is the answer
…and many other variations.
This is a common issue, not just with LLMs but ...
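To make the problem concrete, here is a minimal sketch (not the book's evaluation code; the function names and the tiny word-to-digit table are illustrative assumptions) showing how a naive exact-match check fails on every one of these variants, while a simple normalization step, lowercasing, stripping punctuation, and mapping number words to digits, recovers the match:

```python
import re

# Illustrative assumption: a tiny word-to-digit table; a real evaluator
# would use a proper number parser or a fuzzy/semantic matcher.
WORD_TO_DIGIT = {"eleven": "11"}

def normalize(answer: str) -> str:
    """Lowercase, replace punctuation with spaces, and map number words to digits."""
    text = answer.lower()
    text = re.sub(r"[^\w\s]", " ", text)  # strips the stray dots in "....Eleven...."
    return " ".join(WORD_TO_DIGIT.get(tok, tok) for tok in text.split())

def matches(prediction: str, ground_truth: str) -> bool:
    """True if the normalized ground truth appears as a token of the normalized prediction."""
    return normalize(ground_truth) in normalize(prediction).split()

ground_truth = "11"
predictions = [
    "Eleven",
    "The answer is Eleven",
    "…………..Eleven………",
    "The answer is 11",
    "11 is the answer",
]

for pred in predictions:
    exact = (pred == ground_truth)  # naive exact match: fails on every variant here
    print(f"exact={exact!s:<5} normalized_match={matches(pred, ground_truth)}  {pred!r}")
```

Even this small sketch only handles surface-level variation; judging semantic correctness in longer, free-form answers is the harder part of LLM evaluation.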