O'Reilly logo

Java Deep Learning Projects by Md. Rezaul Karim

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Weight optimization

Before the training starts, the network parameters are set randomly. Then to optimize the network weights, an iterative algorithm called Gradient Descent (GD) is used. Using GD optimization, our network computes the cost gradient based on the training set. Then, through an iterative process, the gradient G of the error function E is computed.

In following graph, gradient G of error function E provides the direction in which the error function with current values has the steeper slope. Since the ultimate target is to reduce the network error, GD makes small steps in the opposite direction -G. This iterative process is executed a number of times, so the error E would move down towards the global minima. This way, the ultimate ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required