A relation network consists of two important functions: the embedding function and the relation function. The embedding function extracts features from the input. If the input is an image, we can use a convolutional network as the embedding function, which gives us the feature vector (embedding) of the image. If the input is text, we can use an LSTM network to obtain its embedding.
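To make the two roles concrete, here is a minimal sketch in plain Python. It is not the actual architecture: a single linear layer with ReLU stands in for the convolutional or LSTM embedding network, and the weights are random rather than learned. The names `embedding_function`, `relation_function`, `W_embed`, and `W_relation`, as well as the dimensions, are hypothetical choices for illustration. The sketch also assumes the usual relation network design, in which the relation function receives the concatenated embeddings of two inputs and returns a score between 0 and 1.

```python
import math
import random

random.seed(0)  # reproducible toy weights

INPUT_DIM, EMBED_DIM = 8, 4  # hypothetical sizes, chosen only for this sketch


def make_weights(rows, cols):
    """Random weight matrix; stands in for parameters a real network would learn."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vector):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


W_embed = make_weights(EMBED_DIM, INPUT_DIM)
W_relation = make_weights(1, 2 * EMBED_DIM)


def embedding_function(x):
    """Toy stand-in for a CNN or LSTM: a linear layer followed by ReLU."""
    return [max(0.0, h) for h in linear(W_embed, x)]


def relation_function(support_emb, query_emb):
    """Concatenate two embeddings and squash a linear score into (0, 1)."""
    combined = support_emb + query_emb  # list concatenation
    logit = linear(W_relation, combined)[0]
    return 1.0 / (1.0 + math.exp(-logit))


# Two random "inputs" playing the role of a support example and a query example.
support = [random.random() for _ in range(INPUT_DIM)]
query = [random.random() for _ in range(INPUT_DIM)]

score = relation_function(embedding_function(support), embedding_function(query))
print(f"relation score: {score:.3f}")
```

In a real relation network both functions would be neural networks trained jointly, so that the score is high when the two inputs belong to the same class and low otherwise.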
As we know, in one-shot learning, we have only a single ...