CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection

Introduction

This is the official implementation of the paper CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection.

CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection,
Hojun Choi, Youngsun Lim, Jaeyo Shin, Hyunjung Shim

[Paper] [Project page (TBD)] [BibTeX]

Updates

⛽⛽⛽ Contact: eric970412@gmail.com

[✅] [2024.12.31] 👨‍💻 The official code has been released!

[✅] [2024.10.16] 📄 Our paper is now available! You can find the paper here.

Installation

This project is based on MMDetection 3.x.

It requires the following packages:

MMEngine >= 0.6.0
MMCV >= 2.0.0rc4
MMDetection >= 3.0.0rc6
lvis-api

pip install openmim mmengine
mim install "mmcv>=2.0.0rc4"
pip install git+https://github.com/lvis-dataset/lvis-api.git
mim install "mmdet>=3.0.0rc6"
pip install ftfy regex
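
After installation, it can be useful to confirm that the installed packages meet the minimum versions listed above. The following is a minimal sketch, not part of this repository: the helper names `parse_version` and `check_requirements` are hypothetical, and the parser only handles version strings of the form `X.Y.Z` or `X.Y.ZrcN`.

```python
import re
from importlib import metadata

# Minimum versions taken from the requirements list above.
MINIMUMS = {"mmengine": "0.6.0", "mmcv": "2.0.0rc4", "mmdet": "3.0.0rc6"}


def parse_version(text):
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' into a sortable tuple (hypothetical helper)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, rc = match.groups()
    # A missing rc suffix means a final release, which outranks any release candidate.
    rc_rank = int(rc) if rc is not None else float("inf")
    return (int(major), int(minor), int(patch), rc_rank)


def check_requirements(minimums=MINIMUMS):
    """Return a {package: status} report for the given minimum versions."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Run it in the same environment used for training; any package reported as "missing" or "too old" should be reinstalled with the commands above.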

License

This project is released under the NTU S-Lab License 1.0.

Usage

Obtain CLIP Checkpoints

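We use CLIP's ViT-B/16 model to implement our method. Install CLIP with pip install git+https://github.com/openai/CLIP.git, then run:

import os
import clip
import torch
# Load CLIP ViT-B/16; the returned preprocessing transform is not needed here.
model, _ = clip.load("ViT-B/16")
os.makedirs("checkpoints", exist_ok=True)  # torch.save does not create directories
torch.save(model.state_dict(), "checkpoints/clip_vitb16.pth")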

Pseudo-Label Generation

The pseudo-label generation code is in the pseudo-labels directory. Alternatively, download the pre-generated instances_train2017_pseudo_v0_new.json from Hugging Face.
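
Before training, a quick look at the annotation file can catch a truncated or wrong download. The sketch below assumes the file follows the standard COCO annotation layout (top-level `images`, `annotations`, and `categories` keys). That layout is an assumption based on the file name, not something stated in this README, and `summarize_annotations` is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_annotations(path):
    """Count images, boxes, and per-category boxes in a COCO-style JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Map category ids to readable names; fall back to the raw id if unknown.
    id_to_name = {c["id"]: c["name"] for c in data.get("categories", [])}
    per_category = Counter(
        id_to_name.get(a["category_id"], str(a["category_id"]))
        for a in data.get("annotations", [])
    )
    return {
        "num_images": len(data.get("images", [])),
        "num_annotations": sum(per_category.values()),
        "per_category": dict(per_category),
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py path/to/instances_train2017_pseudo_v0_new.json
    if len(sys.argv) > 1:
        summary = summarize_annotations(sys.argv[1])
        print(summary["num_images"], "images,", summary["num_annotations"], "annotations")
```

If the counts are zero or the script raises a JSON decoding error, the download is likely incomplete.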

Training and Testing

Training and testing on OV-COCO are currently supported.

Citation

@misc{choi2025cotplvisualchainofthoughtreasoning,
      title={CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection}, 
      author={Hojun Choi and Youngsun Lim and Jaeyo Shin and Hyunjung Shim},
      year={2025},
      eprint={2510.14792},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.14792}, 
}
