GENERALIZED CRITIC POLICY OPTIMIZATION: A MODEL FOR COMBINING ADVANTAGE ESTIMATES IN ACTOR CRITIC METHODS
Roumeissa Kitouni, Abderrahim Kitouni, Feng Jiang
SPS
Length: 10:24
We present a general model for actor-critic methods that allows several value function estimates to be combined in order to further reduce the variance of the policy gradient and improve the learning result. We demonstrate the potential of this architecture by implementing an example case that learns some of the PyBullet continuous-control robotic tasks through OpenAI Gym. By experimenting with a special case, we also show how the external parameters affect the overall performance of the policy optimization algorithm.
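The abstract does not state how the estimates are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes two hypothetical critics whose value predictions are each turned into an advantage estimate with generalized advantage estimation (GAE), and then merged with a normalized weighted average. The function names `gae` and `combine_advantages`, the weights, and the choice of GAE are all assumptions made for illustration.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for a single trajectory.

    `values` must hold len(rewards) + 1 entries; the last entry is the
    bootstrap value of the state reached after the final reward.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have one more entry than rewards")
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def combine_advantages(estimates, weights):
    """Normalized weighted average of several advantage estimates.

    This combination rule is an assumption for illustration only.
    """
    if len(estimates) != len(weights):
        raise ValueError("need exactly one weight per estimate")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    horizon = len(estimates[0])
    return [
        sum(w * est[t] for w, est in zip(weights, estimates)) / total
        for t in range(horizon)
    ]


# Hypothetical value predictions from two critics on the same trajectory.
rewards = [1.0, 1.0, 1.0]
values_critic_a = [2.5, 1.8, 0.9, 0.0]
values_critic_b = [3.1, 2.2, 1.1, 0.0]

adv_a = gae(rewards, values_critic_a)
adv_b = gae(rewards, values_critic_b)
combined = combine_advantages([adv_a, adv_b], weights=[0.5, 0.5])
```

In this sketch the weights play the role of externally chosen parameters: setting one weight to zero recovers a single-critic estimate, while intermediate values blend the two.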