Meta-SGD

Let's say we have some task, T. We use a model, , parameterized by some parameter, , and train the model to minimize the loss. We minimize the loss using gradient descent and find the optimal parameter for the model.

Let's recall the update rule of a gradient descent:

So, what are the key elements that make up our gradient descent? Let's ...

Get Hands-On Meta Learning with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.