A simple implementation of Mean Actor-Critic (MAC) on Lunar Lander and Cart Pole. MAC reduces the variance of the policy gradient estimate by marginalizing over actions: the gradient sums over all actions, each weighted by its probability under the policy, rather than using only the action that was sampled. See the paper for more details: https://arxiv.org/abs/1709.00503
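The marginalization can be illustrated with a small NumPy sketch. It is not taken from this repository's code: it assumes a softmax policy over three discrete actions in a single state, and the function names (`mac_gradient`, `sampled_gradient`) and the numeric values are hypothetical. It compares the gradient of the MAC objective, the sum over actions of pi(a|s) Q(s,a) with respect to the policy logits, against the single-sample score-function estimate.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over a 1-D array of logits.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def mac_gradient(logits, q_values):
    # Gradient of sum_a pi(a|s) * Q(s, a) with respect to the logits.
    # For a softmax policy this is pi_k * (Q_k - sum_a pi_a * Q_a).
    pi = softmax(logits)
    expected_q = np.dot(pi, q_values)
    return pi * (q_values - expected_q)


def sampled_gradient(logits, q_values, action):
    # Single-sample estimate: Q(s, a) * grad log pi(a|s) for one action.
    # For a softmax policy, grad log pi(a|s) = one_hot(a) - pi.
    pi = softmax(logits)
    one_hot = np.zeros_like(pi)
    one_hot[action] = 1.0
    return q_values[action] * (one_hot - pi)


# Hypothetical logits and action values for one state (illustration only).
logits = np.array([0.5, -0.2, 0.1])
q_values = np.array([1.0, 3.0, -2.0])

pi = softmax(logits)
# Averaging the single-sample estimate over the policy recovers the MAC gradient.
expected_sampled = sum(
    pi[a] * sampled_gradient(logits, q_values, a) for a in range(len(pi))
)

print("MAC gradient:            ", mac_gradient(logits, q_values))
print("Expected sampled gradient:", expected_sampled)
```

Both quantities agree in expectation, but the MAC form needs no sampling over actions for a given state, which is where the variance reduction comes from. In practice Q(s,a) would come from a learned critic rather than fixed values.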
kavosh8/MAC