Numerical Computing with Python

by Pratap Dangeti, Allen Yu, Claire Chung, Aldrin Yim

December 2018

Beginner to intermediate

682 pages

18h 1m

English

Packt Publishing

Content preview from Numerical Computing with Python

Markov decision processes and Bellman equations

A Markov decision process (MDP) formally describes an environment for reinforcement learning, where:

The environment is fully observable
The current state completely characterizes the process (that is, the future state depends only on the current state, not on historic states or values)
Almost all RL problems can be formalized as MDPs (for example, optimal control primarily deals with continuous MDPs)

Central idea of MDP: an MDP relies on the simple Markov property of a state; that is, S_{t+1} depends entirely on the latest state S_t rather than on any earlier history. In the following equation, the current state captures all the relevant information from the history, which means ...
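The Markov property described above can be illustrated with a small simulation. The following is a minimal sketch, not taken from the book: the three weather states, the transition probabilities, and the helper names (`next_state`, `simulate`, `conditional_frequency`) are all hypothetical and chosen only for illustration. It samples each new state from the current state alone, then compares the empirical probability of a transition given only the current state with the same probability given the current and the previous state. Under the Markov property the two estimates should roughly agree.

```python
import random

# Hypothetical 3-state chain; the probabilities are illustrative assumptions.
TRANSITIONS = {
    "sunny": {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy": {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}


def next_state(current, rng):
    """Sample S_{t+1} using only S_t; no earlier history is consulted."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate(start, steps, seed=0):
    """Generate a path of states of length steps + 1, starting from `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path


def conditional_frequency(path, current, nxt, previous=None):
    """Empirical P(S_{t+1}=nxt | S_t=current), optionally also given S_{t-1}=previous."""
    hits = total = 0
    for t in range(1, len(path) - 1):
        if path[t] != current:
            continue
        if previous is not None and path[t - 1] != previous:
            continue
        total += 1
        hits += path[t + 1] == nxt
    return hits / total if total else float("nan")


path = simulate("sunny", 200_000)

# Conditioning on the current state only.
p_given_current = conditional_frequency(path, "cloudy", "rainy")

# Conditioning on the current state and one step of extra history.
p_given_history = conditional_frequency(path, "cloudy", "rainy", previous="sunny")

print(p_given_current, p_given_history)
```

Both estimates should land close to the 0.3 entry in the assumed table for cloudy to rainy, which is one way to see that knowing the previous state adds no information once the current state is known.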


Publisher Resources

ISBN: 9781789953633