Chapter 5. Building with Search
In this chapter, we’ll outline how you can put these concepts into practice. For the search technology, we’ll discuss OpenSearch and Amazon OpenSearch Service, but you could similarly employ any commercial or open source solution.
OpenSearch, like most search engines, is a distributed system, deployed on clusters of instances (nodes) that serve different roles. You run the OpenSearch process on each node, and the nodes discover one another to form a cluster. OpenSearch replicates data automatically, both for better durability and to parallelize the workload across compute, memory, and storage resources. You can run OpenSearch yourself or use a hosted version, such as Amazon OpenSearch Service, which takes care of undifferentiated tasks like deploying hardware, installing and upgrading OpenSearch software, and monitoring for and repairing defects in your cluster.
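To make the replication and parallelism concrete, here is a minimal Python sketch of the settings body you might send when creating an index. The setting names `number_of_shards` and `number_of_replicas` are standard OpenSearch index settings; the helper names, the example values, and the copy-count arithmetic are illustrative assumptions rather than recommendations.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Build an index-creation body (hypothetical helper, illustrative only)."""
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least one primary shard and zero or more replicas")
    return {
        "settings": {
            "index": {
                # Primary shards split the index so work can run in parallel.
                "number_of_shards": primary_shards,
                # Each replica is a full extra copy of every primary shard.
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Count every shard copy the cluster must place across its nodes."""
    return primary_shards * (1 + replicas)


# Example values chosen for illustration, not taken from the chapter.
body = index_settings(primary_shards=3, replicas=1)
print(json.dumps(body, indent=2))
print("shard copies to place:", total_shard_copies(3, 1))
```

The body is only built and printed here. Sending it to a cluster (for example, as the body of a create-index request) is left out so that the sketch runs without a live OpenSearch endpoint.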
Amazon OpenSearch Service has included a vector engine since 2021, when the service made a kNN plug-in available that provides the data structures and algorithms to support storing and searching vectors. OpenSearch’s Neural plug-in simplifies the process of generating and attaching vector embeddings to documents and creating the embeddings for user queries.
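As a rough illustration of what storing and searching vectors looks like in practice, the following Python sketch builds the two request bodies involved: an index definition with a `knn_vector` field, and a `knn` query against that field. The `index.knn` setting, the `knn_vector` type with its `dimension` parameter, and the `knn` query clause come from the k-NN plug-in; the field name `product_embedding`, the dimension of 4, and the placeholder vector are assumptions made up for this example. A real embedding would come from a model and typically has hundreds of dimensions.

```python
import json
from typing import Sequence


def knn_index_body(field: str, dimension: int) -> dict:
    """Index definition with one vector field (hypothetical helper)."""
    return {
        # Enable k-NN data structures for this index.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension}
            }
        },
    }


def knn_query_body(field: str, vector: Sequence[float], k: int) -> dict:
    """Query that asks for the k nearest neighbors of `vector`."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": list(vector), "k": k}}},
    }


FIELD = "product_embedding"  # assumed field name, not from the chapter
DIMENSION = 4                # toy dimension for readability

index_body = knn_index_body(FIELD, DIMENSION)

# Placeholder query vector; in practice a model would generate it.
query_vector = [0.1, 0.2, 0.3, 0.4]
if len(query_vector) != DIMENSION:
    raise ValueError("query vector must match the mapped dimension")
query_body = knn_query_body(FIELD, query_vector, k=3)

print(json.dumps(index_body, indent=2))
print(json.dumps(query_body, indent=2))
```

The sketch only constructs the JSON bodies. Generating the embeddings themselves, which the Neural plug-in is described as simplifying, is outside what this example shows.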
ML-Powered Search
As an example, consider an ecommerce website, built to bring products to customers who want to buy those products. Search engines have played a primary role in retrieving relevant products, ...