Reinforcement learning algorithms with PyTorch
Online RL: the agent interacts with the environment during training.
| algorithm | discrete control | continuous control |
|---|---|---|
| Deep Q-Network (DQN) | ✔ | ⛔ |
| Double DQN (DDQN) | ✔ | ⛔ |
| Deep Deterministic Policy Gradients (DDPG) | ⛔ | ✔ |
| Soft Actor-Critic (SAC) | ⛔ | ✔ |
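
As a rough illustration of what "interacting with the environment during training" means, here is a minimal sketch of an online training loop. It is not taken from this repository: `ChainEnv`, `greedy`, and `train_online` are hypothetical names, and tabular Q-learning stands in for the neural-network updates that DQN, DDQN, DDPG, and SAC perform with PyTorch. The point is only the shape of the loop: act, observe a fresh transition, update, repeat.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: walk right along a chain to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right; action 0 moves left (clipped at state 0).
        if action == 1:
            self.state += 1
        else:
            self.state = max(0, self.state - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def greedy(q_row, rng):
    """Pick an action with the highest Q-value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def train_online(episodes=200, gamma=0.9, lr=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0, 0.0] for _ in range(env.length)]  # Q-table: q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy exploration: the agent chooses its own data.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = greedy(q[state], rng)
            # Fresh experience comes from the environment at every step.
            next_state, reward, done = env.step(action)
            bootstrap = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += lr * (reward + bootstrap - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for state, row in enumerate(train_online()):
        print(state, row)
```

In the deep variants listed above, the Q-table would be replaced by a network and transitions would typically pass through a replay buffer, but the environment is still queried throughout training.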
Offline RL: the agent trains on an existing dataset and does not interact with the environment during training.
| algorithm | discrete control | continuous control |
|---|---|---|
| Conservative Q-Learning (CQL) | ✔ | ✔ |
| Batch-Constrained deep Q-learning (BCQ) | ✔ | |
| Policy in the Latent Action Space (PLAS) | ⛔ | ✔ |
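
For contrast with online training, the sketch below shows the general shape of training from a fixed dataset. It is an assumption-laden toy, not this repository's code: `collect_dataset` and `train_offline` are hypothetical names, the transitions come from a made-up chain task, and the update is plain tabular Q-learning. CQL, BCQ, and PLAS each add their own mechanisms on top of this to cope with actions that the dataset does not cover; none of that is shown here.

```python
import random


def collect_dataset(num_steps=2000, length=5, seed=0):
    """Log transitions from a uniformly random behaviour policy on a toy chain.

    This happens once, before training; training itself never calls it again.
    """
    rng = random.Random(seed)
    dataset, state = [], 0
    for _ in range(num_steps):
        action = rng.randrange(2)  # 1 moves right, 0 moves left
        next_state = state + 1 if action == 1 else max(0, state - 1)
        done = next_state == length - 1
        reward = 1.0 if done else 0.0
        dataset.append((state, action, reward, next_state, done))
        state = 0 if done else next_state  # restart after reaching the goal
    return dataset


def train_offline(dataset, num_states=5, updates=500, batch_size=32,
                  gamma=0.9, lr=0.5, seed=0):
    """Fit a Q-table using only logged transitions (no environment access)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(num_states)]  # Q-table: q[state][action]
    for _ in range(updates):
        # Each update samples a minibatch from the fixed dataset.
        batch = rng.sample(dataset, batch_size)
        for state, action, reward, next_state, done in batch:
            bootstrap = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += lr * (reward + bootstrap - q[state][action])
    return q


if __name__ == "__main__":
    logged = collect_dataset()
    for state, row in enumerate(train_offline(logged)):
        print(state, row)
```

Note that `train_offline` takes only the logged transitions as input, which is the defining constraint of the offline setting.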