Chapter 3. Observable AI
Now that you’ve deployed your AI application, it is tempting to sit back, relax, and let customers have a seamless experience with your model. After all, haven’t you evaluated the model offline on representative data and load tested it before deploying it to production? In practice, the experience is often far from seamless. In traditional software applications, we care mostly about operational metrics such as latency and throughput. For AI applications, in addition to operational metrics, we also care about the quality and performance of the model itself.
Here’s an example of how performance can degrade. Suppose a product website builds a recommendation system. Performance is great initially: customers find the recommendations useful and sales go up. But a week later, performance starts to drop, and it gets so bad that the model does worse than the previous, simplistic model. What happened here? A few weeks of ...