Building and training a model is one thing; deploying your model in a production system is a different and often overlooked story. Running code in a Python notebook is nice, but not a great way to serve web clients. In this chapter we’ll look at how to get up and running for real.
We’ll start with embeddings. Embeddings have played a role in many of the recipes in this book. In Chapter 3, we looked at the interesting things we can do with word embeddings, like finding similar words by looking at their nearest neighbors or finding analogies by adding and subtracting embedding vectors. In Chapter 4, we used embeddings of Wikipedia articles to build a simple movie recommender system. In Chapter 10, we saw how we can treat the output of the final layer of a pretrained image classification network as embeddings for the input image and use this to build a reverse image search service.
Just as with these examples, we find that real-world cases often end with embeddings for certain entities that we then want to query from a production-quality application. In other words, we have a set of images, texts, or words, and an algorithm that produces a vector in a high-dimensional space for each of them. To build a concrete application, we need to be able to query this space.
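To make the idea of querying an embedding space concrete, here is a minimal sketch using a toy, hand-made embedding matrix (the vectors and labels are purely illustrative, not the output of any real model): given a query vector, we rank all entities by cosine similarity and return the closest ones.

```python
import numpy as np

# Toy example: five entities, each "embedded" as a 4-dimensional vector.
# In a real system these vectors would come from a trained model.
embeddings = np.array([
    [0.9, 0.1, 0.0, 0.2],
    [0.8, 0.2, 0.1, 0.1],
    [0.0, 0.9, 0.8, 0.1],
    [0.1, 0.8, 0.9, 0.0],
    [0.5, 0.5, 0.5, 0.5],
])
labels = ["cat", "kitten", "car", "truck", "misc"]

def most_similar(query_vector, top_n=3):
    """Return the top_n labels ranked by cosine similarity to query_vector."""
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_vector)
    similarities = embeddings @ query_vector / norms
    best = np.argsort(-similarities)[:top_n]
    return [labels[i] for i in best]

print(most_similar(embeddings[0]))  # the "cat" vector is closest to itself
```

This brute-force scan is fine for a handful of vectors, but it recomputes every similarity on each query, which is exactly what the approaches in this chapter will help us avoid at scale.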
We’ll start with a simple approach: we’ll build a nearest neighbor model and save it to disk, so we can load it when we need it. We’ll then look at using Postgres for the same purpose. ...
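A sketch of that first approach might look like the following, assuming scikit-learn is available; the random matrix stands in for real embeddings, and the filename is illustrative. We fit a nearest neighbor index, pickle it to disk, and load it back the way a serving process would.

```python
import pickle

import numpy as np
from sklearn.neighbors import NearestNeighbors

# Illustrative random "embeddings" standing in for real model output.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(1000, 64)).astype("float32")

# Fit an unsupervised nearest neighbor index on the vectors.
nn = NearestNeighbors(n_neighbors=5, metric="cosine").fit(embeddings)

# Save the fitted model to disk so another process can load it later.
with open("nn_model.pkl", "wb") as fout:
    pickle.dump(nn, fout)

# In the serving process: load the model and answer a query.
with open("nn_model.pkl", "rb") as fin:
    nn_loaded = pickle.load(fin)

distances, indices = nn_loaded.kneighbors(embeddings[:1])
print(indices[0])  # neighbors of the first vector; index 0 is the vector itself
```

Pickling works, but it ties the serving process to the exact library versions used at training time, which is one motivation for moving the vectors into a store like Postgres instead.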