Chapter 11. RAG Web Apps
Most generative AI projects start as experiments, and most never reach production. Fast prototyping matters, especially when your team lacks full-stack engineering support. You need a setup that lets data scientists build RAG-powered apps that real users can test before committing to production infrastructure.
Streamlit is a Python framework for rapid web app development. It lets you build functional interfaces from a single script that covers both the frontend and the backend. The framework provides prebuilt components you can assemble without low-level web development, though you’re limited to the components Streamlit currently offers.
Streamlit reruns your entire script from top to bottom on every user interaction. This simplifies development because you never wire up callbacks or routes by hand, but it creates overhead that doesn’t scale beyond a few dozen concurrent users. For validating whether a GenAI app provides value, this trade-off works in your favor.
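To make the rerun model concrete, here is a minimal sketch in plain Python that imitates it without importing Streamlit. The names run_script, simulate, and SessionState are hypothetical and exist only for this illustration; in a real app, Streamlit performs the rerun for you, and st.session_state plays the role of the dictionary that survives between runs.

```python
from typing import Any

# Hypothetical stand-in for per-session state; this is not Streamlit's API.
SessionState = dict[str, Any]


def run_script(state: SessionState, clicked: bool) -> str:
    """Imitate an app script that executes top to bottom on every interaction."""
    # Local variables are recreated on every run, so they never accumulate.
    local_count = 0
    # Values stored in the session state survive across reruns.
    state.setdefault("count", 0)
    if clicked:
        local_count += 1
        state["count"] += 1
    return f"local={local_count} persisted={state['count']}"


def simulate(clicks: int) -> list[str]:
    """Run the script once for the page load, then once per button click."""
    state: SessionState = {}
    outputs = [run_script(state, clicked=False)]
    for _ in range(clicks):
        outputs.append(run_script(state, clicked=True))
    return outputs


if __name__ == "__main__":
    for line in simulate(3):
        print(line)
```

The local counter never gets past 1 because each run starts from scratch, while the value kept in the state dictionary keeps growing. The same reasoning explains why expensive work, such as loading a model or building an index, is worth keeping out of the part of the script that reruns on every interaction.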
For production applications with thousands or millions of users, alternative frameworks provide better scalability and performance. Django offers full-featured capabilities with authentication and complex workflows. Flask provides a lightweight and flexible foundation. FastAPI is optimized for API endpoints.
The recipes in this chapter build progressively. You’ll start by creating a basic Streamlit app to understand the framework’s core mechanics, then build a working RAG chatbot, and finally extend it to handle document uploads and queries.
Tip
In the early days of RAG, many applications ...