Part I of this book focuses on using tabular methods to represent the value functions of Markov decision processes (MDPs). This approach, however, is limited to small-scale MDPs: in many real-world problems, the state space is far too large for a table-based representation to be practical. There are two main reasons for this. First, the sheer number of states may require a prohibitive amount of memory to store one value per state. Second, a table does not generalize: values learned for states that have been visited tell us nothing about states that have never been encountered.
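As a minimal sketch of the memory point, the contrast below compares a tabular value function, which stores one entry per state, with a linear approximator that stores only a small weight vector and computes values from features. The state count, feature count, and feature map here are all illustrative assumptions, not taken from the book:

```python
import numpy as np

# Tabular representation: one stored value per state, so memory grows
# linearly with the size of the state space.
n_states = 10**6          # hypothetical large discrete state space
tabular_v = np.zeros(n_states)

# Linear function approximation: store only a weight vector w and compute
# v_hat(s) = w . phi(s) from a feature vector phi(s). Memory now depends
# on the number of features, not the number of states.
n_features = 100          # illustrative choice
w = np.zeros(n_features)

def phi(s: int) -> np.ndarray:
    """Hypothetical feature map: a one-hot coarse binning of the state index.
    Real features would be designed for the task at hand."""
    x = np.zeros(n_features)
    x[s % n_features] = 1.0
    return x

def v_hat(s: int) -> float:
    """Approximate value of state s under the current weights."""
    return float(w @ phi(s))
```

Note that states sharing features also share value estimates under this scheme, which is exactly how function approximation trades exactness for generalization and compactness.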
To overcome these ...