To train the network, we first randomly initialize all network parameters using a standard normal distribution (see notebook). For a given number of iterations or epochs, we run momentum updates and compute the training loss, as follows:
def train_network(iterations=1000, lr=.01, mf=.1):
    # Initialize weights and biases
    param_list = list(initialize_weights())
    # Momentum Matrices = [MWh, Mbh, MWo, Mbo]
    Ms = [np.zeros_like(M) for M in param_list]
    train_loss = [loss(forward_prop(X, *param_list), Y)]
    for i in range(iterations):
        # Update the moments and the parameters
        Ms = update_momentum(X, Y, param_list, Ms, mf, lr)
        param_list = update_params(param_list, Ms)
        train_loss.append(loss(forward_prop(X, *param_list), Y))
    return ...
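The helper functions referenced above (initialize_weights, update_momentum, and update_params) are defined in the accompanying notebook. As a rough sketch of how they could be implemented, the snippet below assumes classic momentum updates and a hypothetical backprop function that returns one gradient array per parameter array; the layer sizes in initialize_weights are illustrative placeholders only:

import numpy as np

def initialize_weights(n_inputs=2, n_hidden=3, n_outputs=1):
    # Illustrative layer sizes; draw all parameters from a standard normal distribution
    Wh = np.random.randn(n_inputs, n_hidden)   # input-to-hidden weights
    bh = np.random.randn(1, n_hidden)          # hidden-layer biases
    Wo = np.random.randn(n_hidden, n_outputs)  # hidden-to-output weights
    bo = np.random.randn(1, n_outputs)         # output biases
    return Wh, bh, Wo, bo

def update_momentum(X, y, param_list, Ms, mf, lr):
    # Assumed backprop helper: returns the loss gradient for each parameter array
    grads = backprop(X, y, *param_list)
    # Classic momentum: decay the previous velocity, then subtract the scaled gradient
    return [mf * M - lr * g for M, g in zip(Ms, grads)]

def update_params(param_list, Ms):
    # Move each parameter array along its current velocity
    return [P + M for P, M in zip(param_list, Ms)]

With this scheme, each velocity matrix in Ms accumulates a decaying sum of past gradients, so the momentum factor mf smooths the parameter trajectory while the learning rate lr scales the size of each step.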