January 2020
Intermediate to advanced
432 pages
10h 18m
English
We have already seen how we can tackle a continuous or infinite observation space by discretizing it into buckets. This works well but as we saw computationally, it does not scale well to massive problems of observation state space. By introducing DL, we can effectively increase our state space inputs, but not nearly in the amount we need. Instead, we need to introduce the concept of a partially observable Markov decision process (POMDP). That is, we can consider any problem that is an infinite MDP to be partially observable, meaning the agent or algorithm needs only observe the local or observed state in order to make actions. If you think about it, this is exactly the way you interact with your ...
Read now
Unlock full access