
Java Deep Learning Projects by Md. Rezaul Karim


Calculating the reward

Now that the agent is provided with some guidance (reinforcement), the next task is to calculate the reward for each action the agent takes. Take a look at this code:

// Compute reward for an action
float calcReward(float[][] CurrMap, float[][] NextMap) {
    int newGoal = calcGoalPos(NextMap); // first, we calculate goal position for each map
    if (newGoal == -1) // if goal position is the initial position (i.e. no move)
        return (size * size + 1); // we reward the agent to 4*4+ 1 = 17 (i.e. maximum reward)
    return -1f; // else we reward -1.0 for each bad move
}
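To try the reward rule in isolation, here is a minimal, self-contained sketch. It rests on assumptions that this section does not confirm: the grid is 4 x 4 (size = 4), the goal cell is marked with the value 1 in the map, and calcGoalPos returns the flattened index of that cell, or -1 when no such cell is present. The class name RewardSketch and this version of calcGoalPos are hypothetical stand-ins, not the book's implementation.

```java
// Sketch only: RewardSketch and this calcGoalPos are hypothetical stand-ins.
class RewardSketch {
    static final int size = 4; // assumed grid dimension (4 x 4)

    // Assumption: the goal cell holds 1f. Returns its flattened index
    // (row * size + column), or -1 if the map has no goal marker.
    static int calcGoalPos(float[][] map) {
        for (int x = 0; x < size; x++) {
            for (int y = 0; y < size; y++) {
                if (map[x][y] == 1f) {
                    return x * size + y;
                }
            }
        }
        return -1;
    }

    // Same rule as the listing above: size * size + 1 when calcGoalPos
    // finds no goal marker in the next map, otherwise a penalty of -1.
    static float calcReward(float[][] currMap, float[][] nextMap) {
        int newGoal = calcGoalPos(nextMap);
        if (newGoal == -1) {
            return size * size + 1; // int result widens to float
        }
        return -1f;
    }
}
```

Under these assumptions, a map with the goal marker at row 3, column 3 gives calcGoalPos a result of 15. calcReward returns 17 when the next map contains no goal marker and -1 for every other move, so each additional step lowers the total reward.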
