Chapter 6. Vector Databases and Similarity Searches
Vector databases have grown rapidly in popularity, especially since the rise of LLMs. Figure 6-1 shows some of the most popular vector databases and when they were released.
FAISS was released in 2017, followed by Milvus and Weaviate in 2019, Vald in 2020, Pinecone in 2021, and Chroma in 2023. At the same time, traditional databases such as PostgreSQL and Elasticsearch have added vector search features to their existing platforms. This means you don’t necessarily need to add a dedicated vector database to your tech stack if your existing SQL or NoSQL database already supports vector search.
Figure 6-1. The evolution of vector stores
This chapter introduces popular libraries and databases that support vector operations, including FAISS, Chroma, and PostgreSQL. You’ll learn how to choose the right vector store for your needs, implement similarity searches, and optimize performance by using indexing techniques.
Note
Throughout this chapter, you’ll encounter both “similarity search” and “semantic search.” Similarity search refers to the technical operation of finding vectors that are close together in vector space. Semantic search describes the user-facing capability of searching by meaning rather than exact keywords. Semantic search is implemented using similarity search as its underlying mechanism.
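To make the idea of "vectors that are close together" concrete, here is a minimal sketch of a brute-force similarity search in plain Python. It assumes cosine similarity as the closeness measure, which is only one common choice. The three-dimensional vectors, the document labels, and the helper names `cosine_similarity` and `top_k` are made up for illustration; real embeddings typically have hundreds or thousands of dimensions and come from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Score every stored vector against the query and keep the k best."""
    scored = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in vectors.items()
    ]
    # Higher cosine similarity means the vectors point in a more similar direction
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" invented for this example (not produced by a real model)
documents = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

query = [0.85, 0.15, 0.05]  # hypothetical embedding of a cat-related query
results = top_k(query, documents, k=2)

for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With these toy values, the two cat-related vectors rank above "car" because they point in nearly the same direction as the query. This sketch compares the query against every stored vector, which is workable only for small collections. Avoiding that exhaustive scan is the job of the indexing techniques that vector stores provide.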
You can find all the code examples ...