SEAC_Pytorch_release

This repository contains the code for the AAAI-DAI 2024 paper: Deployable Reinforcement Learning with Variable Control Rate.

Model and Test Environment Architecture

We implement our variable control rate method on top of the SAC algorithm and call the result Soft Elastic Actor and Critic (SEAC). It allows the agent to execute each action for an elastic (variable) duration at every time step.

The core idea of this algorithm follows the principle of reaction control: instead of the fixed action duration used by almost all classical RL methods, the execution time of each action becomes a variable value chosen within a suitable time range. Because this method reduces the number of decisions, the compute load is dramatically decreased, which helps deploy RL models on platforms with weak compute resources. The implementation structure of this code is shown in the figure below. For more details, please refer to the paper.
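To make the idea concrete, here is a minimal sketch of a variable-duration action loop. This is an illustration only: `select_action`, `rollout`, the duration bounds, and the toy environment are all assumptions, not the paper's implementation.

```python
import random

# Minimal sketch of a variable-control-rate rollout (an illustration, not the
# authors' SEAC implementation). The policy is assumed to output both an
# action and the duration for which that action is held.

DT_MIN, DT_MAX = 0.02, 0.5  # assumed bounds on a single action's duration (s)

def select_action(state):
    """Placeholder policy: a random action plus a random hold duration."""
    action = random.uniform(-1.0, 1.0)
    duration = random.uniform(DT_MIN, DT_MAX)
    return action, duration

def rollout(env_step, state, horizon_s=5.0):
    """Run one episode; each decision consumes a variable slice of sim time,
    so fewer policy evaluations are needed than at a fixed control rate."""
    t, decisions = 0.0, 0
    while t < horizon_s:
        action, dt = select_action(state)
        state = env_step(state, action, dt)  # env integrates dynamics over dt
        t += dt
        decisions += 1
    return decisions

# Toy "environment": a 1-D integrator x' = a, integrated over dt.
n = rollout(lambda x, a, dt: x + a * dt, state=0.0)
print(n)  # typically far below the 250 decisions a fixed 0.02 s step would need
```

With a fixed 0.02 s step the 5 s episode would always take 250 decisions; with variable durations the same episode is covered in far fewer, which is the source of the compute savings described above.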

Our results have been verified in this Newton gymnasium environment (see the figure below). For more details about this environment, please refer to our paper.

By following these steps, you can reproduce the results in our paper.

OS Environment

All commands on this page are based on Ubuntu 20.04. You may need to adjust some of them for other Linux distributions, Windows, or macOS.

Remote training with docker

We provide a Dockerfile; launch it to build your Docker image. You are welcome to change the paths yourself. You can build the Docker image with:

docker image build [OPTIONS] PATH_TO_DOCKERFILE

Then you can push it to Docker Hub (or another registry), pull it on your remote PC, and start training.
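The build/push/pull step could look like the following. The image name and registry account are placeholders, not from this repository, and `--gpus all` assumes the NVIDIA Container Toolkit is installed on the remote machine:

```shell
# Build, tag, and push the image, then pull it on the remote machine.
# IMAGE is a placeholder; substitute your own registry account and tag.
IMAGE=your-dockerhub-user/seac:latest

docker image build -t "$IMAGE" PATH_TO_DOCKERFILE   # same build command as above
docker push "$IMAGE"                                # publish to the registry

# On the remote training PC:
docker pull "$IMAGE"
docker run --rm -it --gpus all "$IMAGE"             # GPU passthrough needs nvidia-container-toolkit
```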

A tutorial on how to use Docker.

A tutorial on how to use CUDA with Docker.

Local training with your PC

If you want to train the model locally without speeding up training with local GPU(s), install PyTorch first, then you can directly run:

cd PATH_TO_YOUR_FOLDER
pip3 install -r requirement.txt
python3 main.py

If you want to speed up training with GPU(s), find your NVIDIA driver version and the corresponding CUDA and cuDNN versions, and install them first. Next, install the matching PyTorch version once the NVIDIA and PyTorch environments are set up correctly. Finally, you can run:

cd PATH_TO_YOUR_FOLDER
pip3 install -r requirement.txt
python3 main.py
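Before starting a long run, you can confirm that PyTorch actually sees the GPU. This small check uses the standard `torch.cuda` API; the returned status strings are just illustrative:

```python
def cuda_status():
    """Return a short GPU-availability status string (illustrative wording).

    torch.cuda.is_available() and torch.cuda.get_device_name() are standard
    PyTorch calls; the status labels below are placeholders of our choosing.
    """
    try:
        import torch
    except ImportError:
        return "pytorch-missing"  # install PyTorch first (see above)
    if torch.cuda.is_available():
        return "cuda:" + torch.cuda.get_device_name(0)
    return "cpu-only"  # training will fall back to the CPU

print(cuda_status())
```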

Additionally, you can enable (the default) or disable the variable control rate:

python3 main.py --fix_freq=0   # variable control rate enabled (default)
python3 main.py --fix_freq=1   # variable control rate disabled

For more parameter settings, please refer to the comments in the code.
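The flag above would typically be wired through standard argument parsing; here is a minimal sketch. The flag name comes from the command above, but the default value, choices, and help text are assumptions:

```python
import argparse

# Sketch of how the --fix_freq flag from the command above might be parsed.
# The default of 0 (variable control rate enabled) is an assumption.
def build_parser():
    parser = argparse.ArgumentParser(description="SEAC training entry point (sketch)")
    parser.add_argument(
        "--fix_freq", type=int, default=0, choices=[0, 1],
        help="0: variable control rate (SEAC, default); 1: fixed control rate",
    )
    return parser

args = build_parser().parse_args(["--fix_freq=1"])
print(args.fix_freq)  # → 1
```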

We have tested our code on a PC with an Intel 13600K CPU and an NVIDIA RTX 4070 GPU, with the following software versions:

  • CUDA: 11.8
  • cuDNN: 8.7.0
  • Driver: 535.104.05
  • PyTorch: 2.0.1+cu118

The results are shown in the following images:

Average Returns

Average returns for three algorithms trained in 1.2 million steps. The figure on the right is a partially enlarged version of the figure on the left.

Average Time Cost

Average time cost per episode for three algorithms trained in 1.2 million steps. The figure on the right is a partially enlarged version of the figure on the left.

SEAC Model Explanation:

Four example tasks show how SEAC changes the control rate dynamically to adapt to the Newtonian mechanics environment and ultimately reasonably complete the goal.

Energy cost:

The energy cost for 100 trials. SEAC consistently reduces the number of time steps compared with PPO and SAC without affecting the overall average reward. By contrast, SAC and PPO do not optimize for energy consumption and show a much larger spread.

For more details on the method, its implementation, and its parameters, please refer to our paper.

License

MIT

Contact Information

Author: Dong Wang (dong-1.wang@polymtl.ca), Giovanni Beltrame (giovanni.beltrame@polymtl.ca)

You are welcome to contact MISTLAB for more fun and practical robotics and AI related projects and collaborations. :)
