Hands-On Artificial Intelligence for Beginners

by David Dindi, Patrick D. Smith
October 2018
Content level: Beginner
362 pages
9h 32m
English
Packt Publishing

Policy optimization

Policy optimization methods are an alternative to Q-learning and value function approximation. Instead of learning Q-values for state/action pairs, these methods directly learn a policy π that maps a state to an action, and they improve that policy by calculating a gradient of the expected reward with respect to the policy's parameters. Framed as a search or optimization problem, policy methods are a means of learning the correct policy from a stochastic distribution over potential actions. Our network architecture therefore changes slightly so that it learns a policy directly.
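To make the idea of learning a policy directly more concrete, here is a minimal sketch in NumPy. It is not the book's network: it assumes a linear softmax policy, and the names `theta`, `policy_probs`, `grad_log_prob`, and `reinforce_step` are hypothetical, chosen only for illustration. The update shown is a REINFORCE-style step, which is one common policy gradient rule and not necessarily the one the chapter goes on to develop.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def policy_probs(theta, state):
    """Return pi(a | state): one probability per action.

    Assumption: the policy is linear, so the logits are theta @ state.
    """
    return softmax(theta @ state)


def grad_log_prob(theta, state, action):
    """Gradient of log pi(action | state) with respect to theta.

    For a linear softmax policy, row a' of the gradient is
    (1[a' == action] - pi(a' | state)) * state.
    """
    probs = policy_probs(theta, state)
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    return np.outer(one_hot - probs, state)


def reinforce_step(theta, state, action, ret, lr=0.1):
    """One gradient-ascent step: the log-prob gradient scaled by the return."""
    return theta + lr * ret * grad_log_prob(theta, state, action)


# Toy setup (made-up sizes and state values, for illustration only).
rng = np.random.default_rng(0)
n_actions, n_features = 3, 4
theta = np.zeros((n_actions, n_features))
state = np.array([1.0, 0.5, -0.5, 2.0])

# The policy is stochastic: sample an action from pi(. | state).
probs_before = policy_probs(theta, state)
action = rng.choice(n_actions, p=probs_before)

# Pretend the sampled action led to a positive return of 1.0.
theta = reinforce_step(theta, state, action, ret=1.0)
probs_after = policy_probs(theta, state)
```

With `theta` initialized to zeros the policy starts out uniform over the three actions. After a single step with a positive return, the probability of the sampled action in this state goes up and the others go down, which is the sense in which the gradient adjusts the policy itself instead of a table of Q-values.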

Because every state has a distribution of possible actions, the optimization problem becomes easier. We no longer have ...

ISBN: 9781788991063