October 2019
Intermediate to advanced
366 pages
12h 4m
English
The advantages of value-function methods and policy gradient algorithms can be combined in hybrid algorithms that can be more sample efficient and more robust.
Hybrid approaches combine Q-functions and policy gradients so that each improves the other. These methods estimate the Q-function for the actions chosen by a deterministic policy and use that estimate to improve the policy directly.
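As a rough illustration of how a Q-function estimate can drive a deterministic policy, here is a minimal sketch on a toy one-step problem. Everything in it is an assumption made for illustration and does not come from the book: the reward r(s, a) = -(a - 2s)^2, the linear policy a = theta * s, the quadratic critic features, and the names `features` and `train`. The critic is fit by least squares, and the actor follows the chain rule dQ/dtheta = dQ/da * da/dtheta.

```python
import numpy as np

def features(s, a):
    # Hypothetical quadratic features; they can represent the toy reward exactly.
    return np.stack([s * s, s * a, a * a], axis=-1)

def train(steps=200, batch=64, actor_lr=0.1, noise=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = 0.0       # actor: deterministic policy a = theta * s
    w = np.zeros(3)   # critic: Q(s, a) = w . features(s, a)
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0, size=batch)
        # Exploration: add Gaussian noise to the deterministic action.
        a = theta * s + noise * rng.normal(size=batch)
        # Toy one-step reward (an assumption); the best action is a = 2s.
        r = -(a - 2.0 * s) ** 2
        # Critic step: episodes last one step, so Q(s, a) = r(s, a);
        # fit the critic weights by least squares on the sampled batch.
        w, *_ = np.linalg.lstsq(features(s, a), r, rcond=None)
        # Actor step: gradient of Q with respect to the action, evaluated
        # at the policy's own action, times da/dtheta = s.
        a_pi = theta * s
        dq_da = w[1] * s + 2.0 * w[2] * a_pi
        theta += actor_lr * np.mean(dq_da * s)
    return theta, w

if __name__ == "__main__":
    theta, w = train()
    print("learned policy gain:", theta)
    print("critic weights:", w)
```

In this sketch the policy gain should approach 2, the gain that maximizes the toy reward. Full hybrid algorithms typically replace the least-squares critic with a bootstrapped temporal-difference update and use neural networks for both actor and critic.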