
MotionTrans: Human VR Data Enable Motion-Level Learning for Robotic Manipulation Policies

Chengbo Yuan, Rui Zhou^, Mengzhen Liu^, Yingdong Hu, Shengjie Wang, Li Yi, Chuan Wen, Shanghang Zhang, Yang Gao*.

[Project Website] [arXiv] [Dataset] [Pi0-VLA Code] [BibTeX]

^ Indicates equal contribution. * Corresponding author.

motiontrans concept teaser

Introduction

MotionTrans is the first framework that achieves explicit end-to-end human-to-robot motion transfer, establishing motion-level policy learning from human data. By cotraining on 15 robot tasks and 15 human tasks, we enable both Diffusion Policy and Pi0-VLA to directly perform 10+ human tasks. Here we open-source all code of the framework, including:

  • (1) Robot teleoperation.
  • (2) Human data collection.
  • (3) MotionTrans Dataset.
  • (4) Human-robot data processing.
  • (5) Human data replay (on robot).
  • (6) Policy human-robot cotraining / finetuning.
  • (7) Checkpoints (Weights) of cotrained policy.
  • (8) Policy inference and deployment (on robot).

Installation

conda create -n dexmimic python=3.10
conda activate dexmimic
pip install -r requirements.txt
pip install torch torchvision peft open3d viser
pip install huggingface-hub==0.21.4 pin==3.3.1 numpy==1.24.4

Since we rely on a ZED 2 camera for visual observation, please also install the ZED SDK following the official instructions.
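
To verify the installation, a minimal sanity check using the ZED Python API could look like the snippet below (this is not part of the repository; pyzed ships with the ZED SDK installer, not with the pip commands above):

# Sanity check that the ZED SDK Python bindings are installed and a camera is reachable.
import pyzed.sl as sl

camera = sl.Camera()
status = camera.open(sl.InitParameters())   # default resolution / depth mode
if status != sl.ERROR_CODE.SUCCESS:
    print(f"Failed to open ZED camera: {status}")
else:
    info = camera.get_camera_information()
    print(f"Opened ZED camera, serial number: {info.serial_number}")
    camera.close()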

(1) Robot Teleoperation

Please follow documents/1.robot_teleoperation.md

(2) Human Data Collection

Please follow documents/2.human_data_collection.md

(3) MotionTrans Dataset

Please download the dataset from this Hugging Face link: MotionTrans Dataset. The details of the dataset can be found in documents/3.motiontrans_dataset.md
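
If you prefer to script the download, a minimal sketch using huggingface-hub (already pinned in the installation step) is shown below; the repo id and local directory are placeholders, use the id from the dataset link above:

# Download the dataset snapshot from the Hugging Face Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="<motiontrans-dataset-repo-id>",   # placeholder: take the id from the link above
    repo_type="dataset",
    local_dir="data/motiontrans_dataset",
)
print("Dataset downloaded to:", local_dir)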

(4) Human-Robot Data Processing

For data processing, run:

bash scripts_data/zarr_human_data_conversion_batch.sh
bash scripts_data/zarr_robot_data_conversion_batch.sh

This will process all tasks in the raw data folder and save the processed data in zarr format. For instruction augmentation (with OpenAI ChatGPT), check out scripts/zarr_get_diverse_instruction.sh. For data visualization, first run the visualization-version processing scripts for a single task:

bash scripts_data/zarr_human_data_conversion_vis.sh
bash scripts_data/zarr_robot_data_conversion_vis.sh

Then run the visualization script (remember to update the data_path in the .sh file to the zarr folder generated above):

bash scripts_data/data_visualization.sh

This will open a window that visualizes the overlaid point clouds for a quick check, along with an interactive viser visualization for detailed inspection. The visualization results are shown below:

motiontrans visualization
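
To inspect a converted episode outside these viewers, a short zarr sketch such as the following can also help; the path and key layout below are assumptions, the actual structure is defined by the conversion scripts:

# Open a processed zarr store read-only and list its contents.
import zarr

root = zarr.open("data/processed/<task_name>.zarr", mode="r")   # placeholder path
print(root.tree())   # group / array hierarchy

# Peek at the top-level arrays (shapes and dtypes depend on the conversion scripts).
for name, array in root.arrays():
    print(name, array.shape, array.dtype)
for name, group in root.groups():
    print("group:", name, list(group.array_keys()))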

(5) Human Data Replay (on Robot)

To replay the processed human data on the robot, run:

bash scripts/replay.sh

Then follow the instructions in your terminal to control the replay process.

(6) Policy Human-Robot Cotraining / Finetuning

We provide the Diffusion Policy codebase in this repository. For Pi0-VLA, please refer to MotionTrans-Pi0-VLA.

For human-robot multi-task cotraining (zero-shot setting in the paper), run:

bash scripts/dp_base_cotraining.sh

The checkpoints will be saved as .ckpt files. Along with the checkpoints, a .yaml file recording (the order of) all training tasks will also be saved; it is used later for policy inference and robot control.

The cotrained checkpoints can be downloaded from here.
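
As a quick sanity check of these artifacts, a hedged sketch is shown below; the file names are placeholders, and depending on how the checkpoint was serialized you may need a different pickle module or loader:

# Inspect a saved checkpoint and the task-order yaml (file names are placeholders).
import torch
import yaml

checkpoint = torch.load("data/outputs/<run_name>/checkpoints/latest.ckpt", map_location="cpu")
print("checkpoint keys:", list(checkpoint.keys()))

with open("data/outputs/<run_name>/tasks.yaml", "r") as f:
    task_order = yaml.safe_load(f)
print("training tasks (in order):", task_order)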

For few-shot finetuning on robot demonstrations (few-shot setting in the paper), run:

bash scripts/dp_base_finetune_5demo.sh
bash scripts/dp_base_finetune_20demo.sh

(7) Policy Inference and Deployment (on Robot)

To deploy trained Diffusion Policy (DP) on the robot, run:

bash scripts/dp_infer.sh

and follow the instructions in your terminal to control the robot execution. Note that the parameters and tricks for action-chunk-based inference strongly affect performance. Please refer to the comments in scripts/dp_infer.sh for parameter details. Pay particular attention to robot_action_horizon, robot_steps_per_inference, gripper_action_horizon, and gripper_steps_per_inference, which should be set carefully to balance inference horizon against action jittering (for more discussion and potential improvements, see the Real-Time Action Chunking blog).
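
To make the trade-off concrete, here is a schematic sketch (not the repository's actual inference loop; the function names are placeholders): the policy predicts a chunk of action_horizon actions, but only the first steps_per_inference of them are executed before the next inference call. Executing a longer prefix means fewer inference calls and fewer chunk boundaries (smoother motion), at the cost of reacting more slowly to new observations.

# Schematic receding-horizon execution of action chunks (placeholder functions).
def run_chunked_control(policy_infer, get_obs, send_action,
                        action_horizon, steps_per_inference, num_steps):
    """policy_infer(obs) is assumed to return a chunk of `action_horizon` actions."""
    executed = 0
    while executed < num_steps:
        observation = get_obs()
        chunk = policy_infer(observation)            # (action_horizon, action_dim)
        # Execute only the first `steps_per_inference` actions, then re-infer.
        for action in chunk[:steps_per_inference]:
            send_action(action)
            executed += 1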

To deploy trained Pi0-VLA on the robot, first start the policy server (refer to MotionTrans-Pi0-VLA), and then run:

bash scripts/pi0_infer.sh

and follow the instructions in your terminal to control the robot execution. The script above starts a client that communicates with the policy server and then controls the robot execution. For more details, refer to the official Pi0-VLA repository.
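
For reference, a minimal client sketch in the spirit of the OpenPi websocket client is shown below; the observation keys, prompt, host, and port are assumptions, and the actual interface is defined by scripts/pi0_infer.sh and the policy server:

# Minimal sketch of a client querying a running Pi0 policy server (placeholder keys and values).
from openpi_client import websocket_client_policy

policy = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)

observation = {
    "observation/image": None,   # camera frame(s); format defined by the server
    "observation/state": None,   # proprioceptive state
    "prompt": "pick up the cup",
}
result = policy.infer(observation)   # returns a dict containing an action chunk
actions = result["actions"]
print("received action chunk of length:", len(actions))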

Acknowledgment

This repository is based on code from Data-Scaling-Laws, UMI, Open-Television, ARCap, OpenPi, Viser, RoboEngine, OneTwoVLA, EgoHOI, DROID, and Pinocchio. We sincerely appreciate their contributions to the open-source community, which have significantly supported this project. We also sincerely thank our AI collaborators ChatGPT, Kimi, and GitHub Copilot!

Citation

If you find this repository useful, please kindly acknowledge our work:

@article{yuan2025motiontrans,
  title={MotionTrans: Human VR Data Enable Motion-Level Learning for Robotic Manipulation Policies},
  author={Yuan, Chengbo and Zhou, Rui and Liu, Mengzhen and Hu, Yingdong and Wang, Shengjie and Yi, Li and Wen, Chuan and Zhang, Shanghang and Gao, Yang},
  journal={arXiv preprint arXiv:2509.17759},
  year={2025}
}
