Chapter 5. Governance of AI
In the previous chapters, we learned how to develop LLM application prototypes, deploy RAG applications that consist of LLM and (vector) database components, and monitor applications. This chapter is slightly different in that we will discuss AI applications from a governance perspective, including cost and data considerations as well as privacy, legal, ethical, and safety concerns.
While legal and regulatory aspects are important and deserving of a separate discussion, this chapter will focus on the aspects of governance that developers who are building AI applications commonly encounter.
Cost Management
In Chapter 4, we discussed how closed-source LLM costs typically scale linearly with the number of tokens. As of September 2024, for example, GPT-4o costs $5/1M input tokens and $15/1M output tokens. Open source models, however, have costs that scale more with infrastructure usage, or cloud costs, depending on hosting options, as we discussed in Chapter 3.
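Because per-token pricing is linear, a rough budget can be worked out before any code is deployed. The following is a minimal sketch, not a billing tool: the `TokenPricing` class, the `estimate_llm_cost` function, and the traffic figures (2,000 input tokens, 500 output tokens, 10,000 requests per day) are hypothetical names and values chosen for illustration. Only the GPT-4o prices come from the text above, and they should be checked against the provider's current price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Price in USD per one million tokens (hypothetical helper type)."""
    input_per_million: float
    output_per_million: float


def estimate_llm_cost(input_tokens: int, output_tokens: int,
                      pricing: TokenPricing) -> float:
    """Return the estimated USD cost, assuming purely linear token pricing."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# GPT-4o prices quoted in the text (as of September 2024).
GPT_4O_SEPT_2024 = TokenPricing(input_per_million=5.00, output_per_million=15.00)

# Assumed workload: these numbers are illustrative, not measurements.
per_request = estimate_llm_cost(2_000, 500, GPT_4O_SEPT_2024)
monthly = per_request * 10_000 * 30  # 10,000 requests/day for 30 days

print(f"Per request: ${per_request:.4f}")
print(f"Per month:   ${monthly:,.2f}")
```

Under these assumptions, a single request costs about $0.0175 and a month of traffic about $5,250. Note that in a RAG application the input token count includes the retrieved context, so the number and size of retrieved chunks directly drive the input side of the bill.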
In addition to inference costs from LLM usage, RAG applications have costs associated with generating and retrieving embeddings. Again, there are closed-source embedding APIs, such as OpenAI’s text-embedding-3-small, which costs $0.02/1M tokens, and open source models that can be hosted locally or with cloud providers. There are also costs associated with data storage, including hosting and storing document embeddings in vector databases.
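The storage side of this can be approximated with simple arithmetic. The sketch below estimates the raw size of a collection of embeddings, assuming each vector is stored as 32-bit floats. The function name `embedding_storage_bytes` and the corpus size of one million chunks are hypothetical; 1,536 is the default output dimension of text-embedding-3-small. Real vector databases add index structures, metadata, and replication on top of this figure, so treat the result as a lower bound.

```python
def embedding_storage_bytes(num_chunks: int, dimensions: int,
                            bytes_per_value: int = 4) -> int:
    """Raw vector storage in bytes, assuming float32 values by default.

    Ignores index overhead, metadata, and replication, all of which
    depend on the vector database and its configuration.
    """
    return num_chunks * dimensions * bytes_per_value


# Assumed corpus: one million document chunks (illustrative only).
raw_bytes = embedding_storage_bytes(num_chunks=1_000_000, dimensions=1536)
raw_gb = raw_bytes / 1_000_000_000  # decimal gigabytes

print(f"Raw vector storage: {raw_gb:.2f} GB")
```

With these assumptions the vectors alone occupy roughly 6.14 GB. Halving the dimension count or storing values at lower precision reduces this proportionally, usually at some cost in retrieval quality.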
For many organizations, it is becoming increasingly important to harness the ...