O'Reilly logo

Java Deep Learning Projects by Md. Rezaul Karim

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Calculating the action mask

Here, we set all the outputs to 0, except the one for the action we actually saw, such that the network multiplies its outputs by a mask corresponding to the one-hot encoded action. We can then pass 0 as the target for all unknown actions, and our neural network should thus perform fine. When we want to predict all actions, we can simply pass a mask of all 1s:

// Get action maskint[] getActionMask(float[][] CurrMap) {        int retVal[] = { 1, 1, 1, 1 };        int agent = calcAgentPos(CurrMap); //agent current position        if(agent < size) // if agent's current pos is less than 4, action mask is set to 0            retVal[0] = 0;        if(agent >= (size * size - size)) // if agent's current pos is 12, we set action mask to 0 too retVal[1] = 0; ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required