April 2018
Earlier, in online case-based planning, human traces provided by experts were the most important component of the learning process. The experts supplied these traces to build a list of solutions, which formed the case base and consumed a large amount of storage. A further drawback was that the traces could not capture every possible combination of states and actions, particularly in continuous state-action spaces.
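The storage and coverage problem can be illustrated with a minimal sketch. This is not the book's implementation: the two-dimensional state, the `attack` and `defend` actions, the stand-in expert rule, and the helper names `build_case_base` and `retrieve` are all hypothetical and chosen only for illustration.

```python
import math
import random


def build_case_base(num_cases, seed=0):
    """Simulate expert traces as stored (state, action) pairs (hypothetical data)."""
    rng = random.Random(seed)
    cases = []
    for _ in range(num_cases):
        # A continuous two-dimensional state; real game states are far larger.
        state = (rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0))
        # Stand-in for the expert's decision, assumed for this sketch only.
        action = "attack" if state[0] > state[1] else "defend"
        cases.append((state, action))
    return cases


def retrieve(case_base, query):
    """Return (distance, action) of the stored case closest to the query state."""
    return min((math.dist(state, query), action) for state, action in case_base)


case_base = build_case_base(1000)
query = (0.5, 0.5)

# In a continuous space, a new state almost never matches a stored one exactly.
exact_match = any(state == query for state, _ in case_base)

# The planner has to fall back on the nearest stored case instead.
distance, action = retrieve(case_base, query)

print("stored cases:", len(case_base))
print("exact match found:", exact_match)
print("nearest case distance:", distance, "action:", action)
```

Every additional trace adds entries to the list, so memory grows with the number of recorded cases, yet an unseen query still has to be answered from the nearest stored case rather than an exact one.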
However, with reinforcement learning, these traces do not need to be stored. Moreover, high-dimensional and continuous state-action spaces can be handled by a deep neural network, which takes them as input and outputs the optimal actions. Moreover, if the ...