[NeurIPS'24] DIFO: Diffusion Imitation from Observations

The official PyTorch implementation of DIFO: Diffusion Imitation from Observations (NeurIPS'24).

Bo-Ruei Huang, Chun-Kai Yang, Chun-Mao Lai, Dai-Jie Wu, Shao-Hua Sun
Robot Learning Lab, National Taiwan University
[Paper] [Website]

DIFO is a novel framework for imitation learning from observations that combines adversarial imitation learning with inverse dynamics regularization. It enables learning from expert observations without requiring expert actions.

@inproceedings{huang2024DIFO,
  author    = {Huang, Bo-Ruei and Yang, Chun-Kai and Lai, Chun-Mao and Wu, Dai-Jie and Sun, Shao-Hua},
  title     = {Diffusion Imitation from Observation},
  booktitle = {38th Conference on Neural Information Processing Systems (NeurIPS 2024)},
  year      = {2024},
}

Installation

Environment Setup

Python 3.10+
MuJoCo 2.1+ - Physics engine

conda create -n difo python=3.10 swig
conda activate difo

pip install -r requirements.txt

Wandb Setup

Set up Weights & Biases by first logging in with wandb login <YOUR_API_KEY>.

Alternatively, you can log to stdout by setting log_format_strs = ["stdout"] in scripts/ingredients/logging.py.

Download Datasets

Download the datasets from Google Drive into the datasets/ directory.

gdown --id 1Bc9pXnJZxgFUhHwJUKE98Mras1TxTC5J -O datasets --folder

Training

We provide configuration YAML files for training DIFO and the baselines in the exp_configs/ directory.

The configurations cover 7 tasks:

point_maze: PointMaze
ant_maze: AntMaze
walker: Walker
fetch_push: FetchPush
door: AdroitDoor
kitchen: OpenMicrowave
car_racing: CarRacing (Image-based)

and 11 algorithms:

difo: DIFO
difo-na: DIFO-NA
difo-uncond: DIFO-Uncond
bc: BC
bco: BCO
gaifo: GAIfO
AIRLfO: AIRLfO
waifo: WAILfO
ot-lfo: OT (LfO)
iq-lfo: IQ-Learn (LfO)
depo: DePO

Wandb Sweep

You can run the training scripts as a Wandb sweep with the following command:

./scripts/sweep <config_path>

# Example
./scripts/sweep exp_configs/point_maze/difo.yaml
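To launch sweeps for several task and algorithm pairs, the command lines can be generated rather than typed by hand. The following is a minimal sketch, not part of the repository. It assumes every config lives at exp_configs/<task>/<algorithm>.yaml, which is only confirmed for the point_maze/difo example above, and the helper names config_path and sweep_commands are hypothetical. Some task and algorithm combinations may not have a config file, so check that the path exists before running a generated command.

```python
from itertools import product

# Task and algorithm keys copied verbatim from the lists in this README.
TASKS = [
    "point_maze", "ant_maze", "walker", "fetch_push",
    "door", "kitchen", "car_racing",
]
ALGORITHMS = [
    "difo", "difo-na", "difo-uncond", "bc", "bco", "gaifo",
    "AIRLfO", "waifo", "ot-lfo", "iq-lfo", "depo",
]


def config_path(task: str, algorithm: str) -> str:
    """Return the assumed config path for a task/algorithm pair."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    # Assumption: the layout exp_configs/<task>/<algorithm>.yaml holds for
    # every pair, as in the documented exp_configs/point_maze/difo.yaml.
    return f"exp_configs/{task}/{algorithm}.yaml"


def sweep_commands(tasks=TASKS, algorithms=ALGORITHMS) -> list[str]:
    """Build one ./scripts/sweep command line per task/algorithm pair."""
    return [
        f"./scripts/sweep {config_path(task, algorithm)}"
        for task, algorithm in product(tasks, algorithms)
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be reviewed.
    for command in sweep_commands(algorithms=["difo"]):
        print(command)
```

Printing the commands keeps the sketch a dry run; the output can be pasted into a terminal or piped to a shell once the paths have been verified.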

Single Run

If you prefer to run a single experiment in the terminal, refer to the commands and parameters in the YAML files. For example, to train DIFO on PointMaze, run the following command:

python -m scripts.train_adversarial difo with difo sac_il 1d_condition_diffusion_reward point_maze algorithm_kwargs.bce_weight=0.1 reward.net_kwargs.emb_dim=128
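The command above appears to follow the Sacred command-line convention used by the imitation library's scripts: a command name, then the keyword with, then named configs and key=value overrides. The sketch below assembles the same command from those parts to make the structure explicit. It is an illustration under that assumption; build_train_command is a hypothetical helper, not a function in this repository, and the reading of each token as a named config or an override is inferred from the command rather than confirmed by the source.

```python
import shlex


def build_train_command(module, command, named_configs, overrides):
    """Assemble a Sacred-style command line as a list of arguments.

    module:        Python module run with `python -m`
    command:       command name passed to the script
    named_configs: names listed after the `with` keyword
    overrides:     mapping of dotted config keys to values
    """
    argv = ["python", "-m", module, command]
    updates = list(named_configs)
    updates += [f"{key}={value}" for key, value in overrides.items()]
    if updates:
        # Named configs and overrides both follow the `with` keyword.
        argv += ["with", *updates]
    return argv


argv = build_train_command(
    module="scripts.train_adversarial",
    command="difo",
    named_configs=["difo", "sac_il", "1d_condition_diffusion_reward", "point_maze"],
    overrides={
        "algorithm_kwargs.bce_weight": 0.1,
        "reward.net_kwargs.emb_dim": 128,
    },
)

# Join the arguments into a single shell-safe string for display.
print(shlex.join(argv))
```

To try a different hyperparameter value, change the entry in overrides and pass the resulting list to subprocess.run from the repository root.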

Code Attribution

This project builds heavily upon the imitation library. All code under the imitation/ directory is sourced from their project. We deeply appreciate their contributions to the field of imitation learning.

Acknowledgements

This work builds upon several excellent open-source projects:

imitation for core imitation learning algorithms and infrastructure
Gymnasium for environment interface
Stable-Baselines3 for RL algorithms
D4RL for environments and demonstrations
MuJoCo for physics simulation

License

This project is licensed under the MIT License. Key components have the following licenses:

Code in the imitation/ directory follows the MIT License of the imitation library

Citation

If you use this code in your research, please cite:

@inproceedings{huang2024DIFO,
  author    = {Huang, Bo-Ruei and Yang, Chun-Kai and Lai, Chun-Mao and Wu, Dai-Jie and Sun, Shao-Hua},
  title     = {Diffusion Imitation from Observation},
  booktitle = {38th Conference on Neural Information Processing Systems (NeurIPS 2024)},
  year      = {2024},
}
