MAML algorithm

Now that we have a basic understanding of MAML, we will explore it in detail. Say we have a model f parameterized by θ, that is, fθ(), and a distribution over tasks, p(T). First, we initialize the parameter θ with some random values. Next, we sample a batch of tasks Ti from the task distribution, that is, Ti ∼ p(T). Say we have sampled five tasks, T = {T1, T2, T3, T4, T5}. Then, for each task Ti, we sample k data points and train the model: we compute the loss and minimize it using gradient descent, finding the task-specific parameters θ'i that minimize the loss:

θ'i = θ − α∇θLTi(fθ)

In the previous equation, θ'i is the optimal parameter for task Ti, α is a hyperparameter (the inner-loop learning rate), and ∇θLTi(fθ) is the gradient of task Ti's loss with respect to θ.
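The per-task inner-loop update described above can be sketched in NumPy. This is a minimal illustration, not the book's code: the linear model, the toy task data, and the helper names (`loss_and_grad`, `inner_update`) are assumptions made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(theta, X, y):
    # Squared-error loss for a linear model f_theta(x) = x . theta,
    # together with its gradient with respect to theta.
    err = X @ theta - y
    loss = np.mean(err ** 2)
    grad = 2 * X.T @ err / len(y)
    return loss, grad

def inner_update(theta, X, y, alpha=0.01):
    # One gradient-descent step on task-specific data (the inner loop):
    # theta_i' = theta - alpha * grad_theta L_Ti(f_theta)
    _, grad = loss_and_grad(theta, X, y)
    return theta - alpha * grad

# Randomly initialize theta, then adapt it separately to each of
# five sampled tasks using k data points per task.
theta = rng.normal(size=3)
k = 10
adapted = []
for _ in range(5):                  # T = {T1, T2, T3, T4, T5}
    X = rng.normal(size=(k, 3))     # k sampled data points for this task
    y = X @ rng.normal(size=3)      # toy task-specific targets
    adapted.append(inner_update(theta, X, y))
```

Each entry of `adapted` holds θ'i for one task; the shared initialization θ is left untouched, which is what the outer (meta) update will later optimize.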
