Now, let's see how to use Meta-SGD in a supervised learning setting. Like MAML, Meta-SGD can be applied to any supervised learning problem, be it regression or classification, that can be trained with gradient descent. First, we need to define the loss function we wish to use: for classification, we can use cross-entropy, and for regression, we can use mean squared error; in general, any loss function suitable for our task will do. A minimal sketch of these two losses is given below.
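For instance, assuming a PyTorch-style setup (the function names, shapes, and helpers here are purely illustrative and not part of the original algorithm), the two loss functions mentioned above could be written as follows:

```python
import torch
import torch.nn.functional as F

def classification_loss(logits, labels):
    # Cross-entropy loss for a classification task:
    # logits has shape (batch, num_classes), labels has shape (batch,).
    return F.cross_entropy(logits, labels)

def regression_loss(predictions, targets):
    # Mean squared error loss for a regression task:
    # predictions and targets share the same shape, e.g. (batch, 1).
    return F.mse_loss(predictions, targets)
```

Whichever loss we pick, it is the quantity whose gradients drive both the task-specific (inner) updates and the meta (outer) update.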
Let's go through this step-by-step:

- Let's say we have a model parameterized by a parameter ...