Keras Reinforcement Learning Projects
by Giuseppe Ciaburro, Sudharsan Ravichandiran, Suriyadeepan Ramamoorthy
Dynamic Programming
Dynamic programming (DP) is a mathematical methodology developed in the 1950s, mainly by Richard Bellman. It addresses classes of problems in which a series of interdependent decisions must be made in sequence. It is based on Bellman's principle of optimality (Bellman, R.E. 1957. Dynamic Programming. Princeton University Press, Princeton, NJ. Republished 2003: Dover, ISBN 0-486-42809-5), which states that an optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.
Consider, for example, the problem of finding the best path that joins two locations. The principle of optimality ...
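To make the path example concrete, the following is a minimal sketch of the principle of optimality applied to a shortest-path problem. The graph, its edge costs, and the names GRAPH, GOAL, best_cost, and best_path are all hypothetical, chosen only for illustration; they do not come from the book. The sketch assumes a small directed acyclic graph, so a memoized recursion is enough.

```python
from functools import lru_cache

# Hypothetical road network (an assumption for illustration):
# GRAPH[u][v] is the cost of travelling directly from u to v.
GRAPH = {
    "A": {"B": 2, "C": 5},
    "B": {"C": 1, "D": 4},
    "C": {"D": 1},
    "D": {},
}
GOAL = "D"


@lru_cache(maxsize=None)
def best_cost(node):
    """Minimum cost from node to GOAL, or infinity if GOAL is unreachable."""
    if node == GOAL:
        return 0
    # Principle of optimality: whatever first step we take, the rest of the
    # route must itself be optimal from the node that step leads to.
    return min(
        (cost + best_cost(nxt) for nxt, cost in GRAPH[node].items()),
        default=float("inf"),
    )


def best_path(start):
    """Recover one optimal route by repeatedly taking the best first step."""
    if best_cost(start) == float("inf"):
        raise ValueError(f"{GOAL} is not reachable from {start}")
    path = [start]
    current = start
    while current != GOAL:
        successors = GRAPH[current]
        current = min(successors, key=lambda nxt: successors[nxt] + best_cost(nxt))
        path.append(current)
    return path


print(best_cost("A"), best_path("A"))
print(best_cost("B"), best_path("B"))
```

In this toy graph, the best route from A passes through B, and the portion of that route starting at B is the same as the best route computed from B directly. That is the property the principle describes: every tail of an optimal path is optimal for its own starting point.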