ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions
Code for the paper "ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions" (ACL 2024).
conda create -n contextblip python=3.9
conda activate contextblip
pip install -r requirements.txt
Datasets: COCO, VG, and ImageCoDe.
Download the following files:
- annotations.zip
- bert-base-uncased.zip
- model_base.pth (the pre-trained BLIP base checkpoint)
- Unzip bert-base-uncased.zip and annotations.zip in ./
- Set the train_file field in the pretraining configuration file ./configs/pretrain.yaml to a list containing the paths of coco.json and vg.json
- Store the COCO and VG images in the ./pretrain_data/vl_pair folder (a pre-flight check covering the config and this layout is sketched after the run commands below)

- Run the code:
unzip bert-base-uncased.zip
unzip annotations.zip
bash run.sh
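Before launching run.sh, a quick pre-flight check can catch path mistakes. This is a minimal sketch, assuming pyyaml is installed and the layout described above; the .jpg extension is an assumption about the image files:

```python
# Pre-flight check (sketch): verify the config and image folder before pretraining.
from pathlib import Path
import yaml

with open('./configs/pretrain.yaml') as f:
    cfg = yaml.safe_load(f)
print('train_file:', cfg['train_file'])  # should list the coco.json and vg.json paths

img_root = Path('./pretrain_data/vl_pair')
assert img_root.is_dir(), f'{img_root} is missing'
print(sum(1 for _ in img_root.rglob('*.jpg')), 'jpg files under', img_root)
```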
Download the ImageCoDe dataset:
- images: image-sets.zip from BennoKrojer/ImageCoDe (huggingface.co)
- annotations: train_data.json, valid_data.json, and test_data_unlabeled.json from imagecode/data in McGill-NLP/imagecode (github.com)
mkdir dataset
mv image-sets.zip dataset/
mv train_data.json dataset/
mv valid_data.json dataset/
mv test_data_unlabeled.json dataset/
cd dataset
unzip image-sets.zip
Check that the image path is ./dataset/image-sets and the annotation path is ./dataset (a quick sanity check is sketched below).
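A minimal sanity check, assuming the layout above; the json files are only loaded and counted, their schema is not assumed:

```python
# Sanity check (sketch): confirm the ImageCoDe files unpacked where finetune.py expects them.
import json
from pathlib import Path

data_dir = Path('./dataset')
assert (data_dir / 'image-sets').is_dir(), 'image-sets.zip not unzipped?'
for name in ['train_data.json', 'valid_data.json', 'test_data_unlabeled.json']:
    with open(data_dir / name) as f:
        print(name, '->', len(json.load(f)), 'entries')
```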
Run the code:
nohup python -u finetune.py --finetuned_checkpoint_path {pretrained model path} > finetune.log 2>&1 &  # start fine-tuning
python zero-shot_new.py --finetuned_checkpoint_path {pretrained model path}  # zero-shot evaluation
python analysis/analysis_finetune.py --finetuned_checkpoint_path {finetuned model path}  # evaluate the finetuned model
You need to replace the finetuned model path in line 58.
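For reference, BLIP-style .pth checkpoints are usually loaded as below. This is a hedged sketch, not the repository's loading code: the example path is hypothetical, and whether the weights sit at the top level or under a 'model' key is an assumption based on common BLIP checkpoints:

```python
# Sketch: inspect a checkpoint passed via --finetuned_checkpoint_path.
import torch

ckpt = torch.load('output/finetune/checkpoint_best.pth', map_location='cpu')  # hypothetical path
state_dict = ckpt.get('model', ckpt)  # some BLIP checkpoints nest weights under 'model'
print(len(state_dict), 'tensors in checkpoint')
```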
python evaluate_vlm_contextblip.py
You need to replace the API key in line 58.
python sample.py  # sample the subsets; the random seed can be changed in the file (a sketch of seeded sampling follows this block)
# the data path needs to be modified in the file
# GPT-4 API (you need a GPT-4-Vision API key)
python GPT4v.py
# ContextBLIP
python analysis/gpt4_comparison.py
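sample.py's internals are not shown here; as a minimal sketch of what seeded subset sampling typically looks like (the data path and the subset size of 100 are placeholders, and the actual script may differ):

```python
# Sketch of seeded subset sampling (sample.py's actual logic may differ).
import json
import random

random.seed(42)  # the seed mentioned above; change it for a different subset

with open('./dataset/valid_data.json') as f:  # placeholder data path
    data = json.load(f)

keys = sorted(data)  # fix iteration order so the seed is reproducible
subset = random.sample(keys, k=min(100, len(keys)))  # placeholder subset size
print(len(subset), 'examples sampled')
```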
In the ablation experiments, the image mask rate is adjusted via a command-line argument:
# set img_mask_rate to the desired image mask ratio before launching
nohup python -u -m torch.distributed.run --nproc_per_node 4 main.py --mask_rate ${img_mask_rate} --output_dir output/Pretrain/${img_mask_rate} > pretrain.log 2>&1 &
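For intuition about what --mask_rate controls, below is a hedged sketch of random patch masking at a given rate; the repository's actual masking implementation may differ:

```python
# Illustration of random patch masking at a given mask rate (not the repo's exact code).
import torch

def random_patch_mask(num_patches: int, mask_rate: float) -> torch.Tensor:
    """Boolean mask with roughly `mask_rate` of the patches set to True (masked)."""
    num_masked = int(num_patches * mask_rate)
    perm = torch.randperm(num_patches)
    mask = torch.zeros(num_patches, dtype=torch.bool)
    mask[perm[:num_masked]] = True
    return mask

print(random_patch_mask(196, 0.3).sum().item())  # 58 of 196 ViT patches masked
```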