January 2018
Beginner to intermediate
284 pages
8h 35m
English
Value-based and policy-based algorithms each work independently, without learning from each other. Actor-critic algorithms aim to overcome this drawback. They combine value functions with policy iteration, so that the policy is updated at every iteration based on the updated value function. The figure, Actor-critic-based reinforcement learning, illustrates this workflow in more detail:

In the preceding figure, you start with an initial policy by an Actor (policy-based algorithm). The Critic or the value function receives the new state as well ...
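
The interplay between the Actor and the Critic can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the book's implementation: the five-state chain environment, the tabular softmax policy, and every name and hyperparameter (`step`, `train`, `alpha_actor`, `alpha_critic`, `gamma`) are hypothetical choices made here. The Critic keeps a state-value estimate and computes a one-step temporal-difference (TD) error; the Actor then adjusts its action preferences in the direction that error suggests.

```python
import math
import random

# Hypothetical toy environment (an assumption for illustration):
# a chain of states 0..4, where reaching state 4 ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action_idx):
    """Move along the chain; reward 1.0 only when the right end is reached."""
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=500, gamma=0.9, alpha_actor=0.1, alpha_critic=0.2, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # Actor: action preferences
    values = [0.0] * N_STATES                      # Critic: state values

    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 100:  # cap episode length
            # Actor picks an action from its current policy.
            probs = softmax(theta[state])
            action = 0 if rng.random() < probs[0] else 1
            next_state, reward, done = step(state, action)

            # Critic evaluates the move with a one-step TD error.
            bootstrap = 0.0 if done else gamma * values[next_state]
            td_error = reward + bootstrap - values[state]
            values[state] += alpha_critic * td_error

            # Actor update: gradient of log softmax, scaled by the TD error.
            for a in range(len(ACTIONS)):
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += alpha_actor * td_error * grad

            state = next_state
            steps += 1

    return theta, values


if __name__ == "__main__":
    theta, values = train()
    for s in range(N_STATES - 1):
        print(s, [round(p, 3) for p in softmax(theta[s])], round(values[s], 3))
```

Running the script prints, for each non-terminal state, the learned action probabilities and the Critic's value estimate. A positive TD error makes the chosen action more likely and a negative one makes it less likely, which is the sense in which the policy is updated from the updated value function.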