Instruction Tuning for Code Efficiency Improvement

Elfsong/Venus

Venus



  • 🎉 What is Venus? Venus is the dataset used to train Afterburner (WIP). It extends the original Mercury dataset and currently covers 6 languages: Python3, C++, JavaScript, Go, Rust, and Java.
  • 🚧 What is the current progress? We are expanding the dataset to include more programming languages.
  • 🔮 Why does Venus stand out? A key contribution of Venus is that it provides runtime and memory distributions built from multiple solutions for each problem, significantly more than existing datasets provide. It can potentially be used for Reinforcement Learning or Instruction Tuning.
  • 🌠 Acknowledgement: Please consider upvoting and citing our work if you find it useful. If you have any questions or issues with the dataset, feel free to email me at mingzhe@nus.edu.sg. Thank you! 😀
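
As a rough illustration of how a per-problem runtime distribution could feed a reward or ranking signal, here is a minimal sketch in plain Python. It does not use the actual Venus schema: the function name `efficiency_percentile`, the sample values, and the millisecond unit are all hypothetical, and the scoring rule (fraction of reference solutions that are strictly slower) is just one possible choice, not the method used by Afterburner.

```python
from bisect import bisect_right


def efficiency_percentile(candidate, distribution):
    """Return the fraction of reference measurements strictly greater than
    `candidate` (0.0 = nothing is slower, 1.0 = everything is slower).

    Works for any "lower is better" metric, e.g. runtime or peak memory.
    """
    if not distribution:
        raise ValueError("distribution must be non-empty")
    ordered = sorted(distribution)
    # Number of reference values that are less than or equal to the candidate.
    not_slower = bisect_right(ordered, candidate)
    return (len(ordered) - not_slower) / len(ordered)


# Hypothetical runtimes (ms) of several accepted solutions to one problem.
reference_runtimes_ms = [120, 95, 300, 150, 95, 210]

# A candidate that runs in 100 ms beats four of the six reference solutions.
score = efficiency_percentile(100, reference_runtimes_ms)
print(f"efficiency percentile: {score:.2f}")
```

The same function can be applied to a memory distribution, since both metrics are "lower is better".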

🪐 Venus Dataset <- TLDR: here is the dataset:)

🔍 Venus Annotation System

Please consider citing our paper if you think the resource is useful. Thank you!

@article{du2025afterburner,
  title={Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization},
  author={Du, Mingzhe and Luu, Anh Tuan and Liu, Yue and Qing, Yuhao and Huang, Dong and He, Xinyi and Liu, Qian and Ma, Zejun and Ng, See-kiong},
  journal={arXiv preprint arXiv:2505.23387},
  url={https://arxiv.org/abs/2505.23387},
  year={2025}
}
