December 2018
In the last section, we saw how Meta-SGD works. We saw how Meta-SGD obtains a better and more robust model parameter θ that generalizes across tasks, along with an optimal learning rate and update direction. Now, we'll better understand Meta-SGD by coding it from scratch. As we did with MAML, for better understanding, we'll consider a simple binary classification task. We randomly generate our input data, train it with a simple single-layer neural network, and try to find the optimal parameter θ. We'll see step-by-step ...
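Before walking through the steps, here is a minimal NumPy sketch of the idea. It is not the book's exact code: the task generator `sample_task`, the dimensions, and the hyperparameter values are illustrative assumptions, and the meta-gradients use a first-order approximation (second-order terms through the inner gradient are dropped). The key Meta-SGD ingredient is that the per-parameter learning rate `alpha` is learned alongside `theta`:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_task(num_samples=10, dim=50):
    # Hypothetical task generator: random inputs with random binary labels
    x = np.random.rand(num_samples, dim)
    y = np.random.choice([0, 1], size=(num_samples, 1)).astype(float)
    return x, y

def loss_and_grad(theta, x, y):
    # Single-layer network: prediction a = sigmoid(x @ theta)
    a = sigmoid(x @ theta)
    # Cross-entropy loss and its gradient w.r.t. theta
    loss = -np.mean(y * np.log(a + 1e-9) + (1 - y) * np.log(1 - a + 1e-9))
    grad = x.T @ (a - y) / len(x)
    return loss, grad

dim = 50
theta = np.random.normal(size=(dim, 1))  # shared initial parameter
alpha = np.full((dim, 1), 0.1)           # learnable per-parameter learning rate
beta = 0.01                              # meta (outer-loop) learning rate
num_tasks = 5

for episode in range(100):
    meta_grad_theta = np.zeros_like(theta)
    meta_grad_alpha = np.zeros_like(alpha)
    for _ in range(num_tasks):
        x_tr, y_tr = sample_task()
        x_te, y_te = sample_task()
        # Inner update: theta' = theta - alpha ∘ grad (elementwise alpha)
        _, g_tr = loss_and_grad(theta, x_tr, y_tr)
        theta_prime = theta - alpha * g_tr
        # Meta-loss gradient at the adapted parameters
        _, g_te = loss_and_grad(theta_prime, x_te, y_te)
        # First-order approximation: d(meta-loss)/d(theta) ≈ g_te
        meta_grad_theta += g_te
        # d(meta-loss)/d(alpha) ≈ -g_te ∘ g_tr (chain rule through theta')
        meta_grad_alpha += -g_te * g_tr
    # Outer update: both theta and alpha are meta-learned
    theta -= beta * meta_grad_theta / num_tasks
    alpha -= beta * meta_grad_alpha / num_tasks
```

Notice the contrast with MAML: there, `alpha` would be a fixed scalar hyperparameter; here it is a vector the same shape as `theta`, updated in the outer loop so that each parameter learns its own step size and, through its sign, its own update direction.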