PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA

[📄 Paper][🐳 Docker][🗁 GitHub]

🔥 Official repo for "PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA".

❗️ Most of the files are inherited from AllenAI's great work. We show our greatest respect for their efforts, and all relevant rights are reserved for the ORIGINAL authors!

[PRoLoRA overview figure]

🔥 News

  • [2024/05/16] 🔥🔥🔥 PRoLoRA is accepted by ACL 2024 (main conference)!

💡 Abstract

With the rapid scaling of large language models (LLMs), serving numerous LoRAs concurrently has become increasingly impractical, leading to unaffordable costs and necessitating more parameter-efficient finetuning methods. In this work, we introduce Partially Rotation-enhanced Low-Rank Adaptation (PRoLoRA), an intra-layer sharing mechanism comprising four essential components: broadcast reduction, rotation enhancement, partially-sharing refinement, and rectified initialization strategy. As a superset of LoRA, PRoLoRA retains its advantages and effectively circumvents the drawbacks of peer parameter-sharing methods, offering superior model capacity, practical feasibility, and broad applicability. Empirical experiments demonstrate the remarkably higher parameter efficiency of PRoLoRA in both specific parameter budget and performance target scenarios, and its scalability to larger LLMs. Notably, even with half as many trainable parameters, PRoLoRA still outperforms LoRA on multiple instruction tuning datasets. Subsequently, an ablation study is conducted to validate the necessity of individual components and highlight the superiority of PRoLoRA over three potential variants. Hopefully, the conspicuously higher parameter efficiency can establish PRoLoRA as a resource-friendly alternative to LoRA.
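For intuition, below is a minimal, illustrative PyTorch sketch of the intra-layer sharing idea (our simplified reading, not the training code used in this repo): a small trainable chunk of a LoRA factor is broadcast (tiled) across the hidden dimension, each replica is rotated (rolled) along the rank dimension so the copies are not identical, and the first u ranks are kept unshared. The chunking/rotation axes and the helper build_prolora_A are illustrative assumptions; please refer to the paper and the code for the exact formulation.

# Minimal, illustrative sketch of PRoLoRA-style factor construction
# (NOT this repo's implementation; axes and names are assumptions).
import torch

def build_prolora_A(base_chunk, unshared, m):
    # base_chunk: trainable shared chunk, shape (r - u, d_in // m)
    # unshared:   trainable unshared ranks, shape (u, d_in)
    # m:          number of broadcast copies along the input dimension
    copies = [torch.roll(base_chunk, shifts=i, dims=0) for i in range(m)]  # rotation enhancement
    shared = torch.cat(copies, dim=1)            # broadcast reduction: (r - u, d_in)
    return torch.cat([unshared, shared], dim=0)  # partial sharing: (r, d_in)

# Toy shapes: rank r = 8, unshared ranks u = 2, input dim d_in = 16, m = 4 copies.
r, u, d_in, m = 8, 2, 16, 4
A = build_prolora_A(torch.randn(r - u, d_in // m), torch.randn(u, d_in), m)
print(A.shape)  # torch.Size([8, 16])
# Trainable entries: u*d_in + (r - u)*(d_in // m), versus r*d_in for vanilla LoRA.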

⚙️ Environment setting

🗁 Prepare GitHub Repo

# Clone the repo to local machine
git clone https://github.com/Forence1999/open-instruct-1121.git
cd open-instruct-1121

🐳 Docker

We recommend setting up the environment with our Docker image, which prepares the whole environment and eases reproduction with minimal effort.

# Pull the image from dockerhub
docker pull forence/open-instruct:v1

# Start the container, remember to replace <PROJECT_DIR> with your own project directory
docker run \
    --name prolora \
    --gpus all \
    --network=host \
    -v <PROJECT_DIR>:/workspace \
    -it forence/open-instruct:v1 /bin/bash

cd /workspace

🐍 Conda

If you use the above Docker image, this step can be skipped, because the conda environment is already prepared in it.

# Create and activate conda environment
conda create -n prolora python=3.11
conda activate prolora

# Install required dependencies
pip install -r requirements.txt

📜 Datasets

The data preparation is inherited from the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources" and the open-instruct GitHub repo, which can be referred to for detailed information. For simplicity, you can download and process the datasets for both fine-tuning and evaluation with the following scripts:

# Prepare the training data
./scripts/prepare_train_data.sh

# Prepare the evaluation data
./scripts/prepare_eval_data.sh

📃 Experiments

The LLaMA series requires additional access requests to download. For LLaMA2 models, please refer to the Hugging Face documentation on LLaMA to request an access token.

There are two alternative methods to pass the access token:

  1. Pass it as a parameter (recommended)
# Set <HF_TOKEN> in the shell script and pass it as:
--token ${HF_TOKEN}
  2. Save it via the Hugging Face token cache
python -c "from huggingface_hub.hf_api import HfFolder; HfFolder.save_token('<HF_TOKEN>')"

All the preparation work is done! Here's an example of fine-tuning LLaMA2-7B on SuperNI and evaluating on MMLU. The running script is as follows:

# Before running the following script, please replace <HF_TOKEN> with your own Hugging Face token
bash ft_llama2_7b_superni_mmlu.sh <LORA_RANK> <UNSHARED_RANK> <REDUCED_LORA_A_X> <REDUCED_LORA_B_X> <LEARNING_RATE> <SEED> <GPU_ID>

Here's a detailed description of each parameter (an illustrative example invocation follows the list):

  • LORA_RANK: The rank of PRoLoRA, referred to as the variable r in our paper.
  • UNSHARED_RANK: Among all r ranks in PRoLoRA, how many ranks are preserved as unshared, referred to as the variable u in our paper.
  • REDUCED_LORA_A_X / REDUCED_LORA_B_X: The sharing multiples of the PRoLoRA matrices A / B, referred to as the variables m / n in our paper, respectively.
  • LEARNING_RATE: Learning rate (linear schedule).
  • SEED: Random seed.
  • GPU_ID: The ID of the GPU assigned to the run.
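
For example, an invocation with purely illustrative hyperparameter values (r = 8, u = 1, m = n = 2, learning rate 1e-4, seed 42, GPU 0) would look like the following; these numbers are placeholders, not recommended settings:

# Illustrative example (replace <HF_TOKEN> inside the script first)
bash ft_llama2_7b_superni_mmlu.sh 8 1 2 2 1e-4 42 0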

We also provide commands to postprocess and summarize the results. The running scripts are as follows:

# For MMLU
python mmlu_summarize.py --ts <TIME_SPAN>

# For TydiQA (assuming the analogous summarizer script)
python tydiqa_summarize.py --ts <TIME_SPAN>
  • TIME_SPAN: Only runs whose result files were modified within the last TIME_SPAN hours are included in the summary.
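
For instance, to summarize only the runs whose results were modified within the past 24 hours:

python mmlu_summarize.py --ts 24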

© Citation

If you find our work helpful, please kindly cite the paper as follows:

@article{wang2024prolora,
      title={PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA}, 
      author={Sheng Wang and Boyang Xue and Jiacheng Ye and Jiyue Jiang and Liheng Chen and Lingpeng Kong and Chuan Wu},
      year={2024},
      eprint={2402.16902},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2402.16902}, 
}
