Building the system was fun. Finding the factors that make the system go wrong is another story.
The model presented so far can be summed up as follows:
The steps from lv to R create the reward matrix R (Chapter 2, Think Like a Machine) required by the reinforcement learning program (Chapter 1, Become an Adaptive Thinker), which then runs from reading R through to the results. Gamma is the learning parameter, Q is the Q-learning function, and the results are the states of Q described in the first chapter.
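As a reminder of what the second half of that pipeline does, here is a minimal sketch of the step from R to the states of Q. It is not the program from the first chapter: the 3x3 reward matrix, the function name `train_q`, and the values chosen for `gamma`, `episodes`, and `seed` are all assumptions made for illustration. Gamma is applied here as the weight on the best value of the next state.

```python
import numpy as np

def train_q(R, gamma=0.8, episodes=2000, seed=0):
    """Fill Q from a reward matrix R (hypothetical helper, illustration only)."""
    rng = np.random.default_rng(seed)
    n = R.shape[0]
    Q = np.zeros((n, n))
    for _ in range(episodes):
        state = rng.integers(n)                 # pick a random starting state
        actions = np.flatnonzero(R[state] > 0)  # moves allowed from that state
        if actions.size == 0:
            continue                            # no allowed move, try again
        action = rng.choice(actions)            # explore one allowed move
        # Immediate reward plus gamma times the best value of the next state
        Q[state, action] = R[state, action] + gamma * Q[action].max()
    return Q

# Assumed toy reward matrix: state 2 is the goal (reward 100),
# a value of 1 marks an allowed move, 0 marks no link.
R = np.array([
    [0, 1, 0],
    [1, 0, 100],
    [0, 1, 100],
])

Q = train_q(R)
print(np.round(Q, 1))
```

Changing the contents of `R` or the value of `gamma` changes the final states of Q, which is why both belong among the factors to measure.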
The parameters to be measured are as follows:
- The company's input data. The training sets found on the Web such as MNIST ...