arXiv:2406.00345 [cs.CV]

DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection

Zhi Zhou, Ming Yang, Jiang-Xin Shi, Lan-Zhe Guo, Yu-Feng Li

Published 2024-06-01 (Version 1)

Vision-language models (VLMs), such as CLIP, have demonstrated impressive zero-shot capabilities on various downstream tasks. Their performance can be further enhanced through few-shot prompt tuning methods. However, current studies evaluate the performance of learned prompts separately on base and new classes. This evaluation lacks practicality for real-world applications, since downstream tasks cannot determine in advance whether the data belongs to base or new classes. In this paper, we explore a problem setting called Open-world Prompt Tuning (OPT), which involves tuning prompts on base classes and evaluating on a combination of base and new classes. By introducing the Decomposed Prompt Tuning framework (DePT), we theoretically demonstrate that OPT can be solved by incorporating out-of-distribution detection into prompt tuning, thereby enhancing base-to-new discriminability. Based on DePT, we present a novel prompt tuning approach, namely Decomposed Context Optimization (DeCoOp), which introduces new-class detectors and sub-classifiers to further enhance both base-class and new-class discriminability. Experimental results on 11 benchmark datasets validate the effectiveness of DePT and demonstrate that DeCoOp outperforms current state-of-the-art methods, providing a significant 2% average accuracy improvement.
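The core routing idea the abstract describes — detect whether a sample belongs to the base classes before choosing a classifier — can be sketched as follows. This is a minimal illustration, not the paper's actual method: it uses a simple maximum-softmax-probability detector with a fixed threshold as the "new-class detector", whereas DeCoOp learns its detectors and sub-classifiers; all function names and the threshold value are assumptions for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax along the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def open_world_predict(base_logits, zero_shot_logits, threshold=0.5):
    """Route each sample between two classifiers.

    base_logits:      scores from the prompt-tuned classifier over base classes
    zero_shot_logits: scores from a zero-shot classifier over new classes
    If the base classifier is confident (max softmax prob >= threshold), keep
    its base-class prediction; otherwise fall back to the new-class classifier.
    New-class labels are offset by the number of base classes.
    """
    n_base = base_logits.shape[-1]
    base_probs = softmax(base_logits)
    msp = base_probs.max(axis=-1)            # detector score per sample
    base_pred = base_probs.argmax(axis=-1)   # base-class prediction
    new_pred = zero_shot_logits.argmax(axis=-1) + n_base
    is_base = msp >= threshold
    return np.where(is_base, base_pred, new_pred), is_base

# Sample 1: confidently a base class; sample 2: ambiguous, routed to new classes.
base_logits = np.array([[5.0, 0.0, 0.0],
                        [1.0, 1.1, 0.9]])
zero_shot_logits = np.array([[0.1, 0.2],
                             [0.3, 0.9]])
preds, is_base = open_world_predict(base_logits, zero_shot_logits)
```

In the paper's setting the detector and the sub-classifiers are themselves learned during prompt tuning, which is what improves base-to-new discriminability over a fixed-threshold heuristic like the one above.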

Comments: Accepted by ICML 2024. Code is available at: https://wnjxyk.github.io/DeCoOp
Categories: cs.CV, cs.LG