arXiv Analytics


arXiv:2503.07851 [cs.LG]

TwinTURBO: Semi-Supervised Fine-Tuning of Foundation Models via Mutual Information Decompositions for Downstream Task and Latent Spaces

Guillaume Quétant, Pavlo Molchanov, Slava Voloshynovskiy

Published 2025-03-10, updated 2025-05-16 (version 2)

We present a semi-supervised fine-tuning framework for foundation models that utilises mutual information decomposition to address the challenge of training with a limited amount of labelled data. Our approach derives two distinct lower bounds: i) for the downstream task space, such as classification, optimised using conditional and marginal cross-entropy alongside Kullback-Leibler divergence, and ii) for the latent space representation, regularised and aligned using a contrastive-like decomposition. This fine-tuning strategy retains the pre-trained structure of the foundation model, modifying only a specialised projector module comprising a small transformer and a token aggregation technique. Experiments on several datasets demonstrate significant improvements in classification tasks under extremely low-label conditions by effectively leveraging unlabelled data.
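The abstract names two families of loss terms and a lightweight projector trained on top of a frozen backbone. The following PyTorch sketch illustrates one possible instantiation under stated assumptions; it is not the paper's implementation. The Projector class, mean-token aggregation, the uniform class prior, and the InfoNCE-style alignment between two latent views are all hypothetical stand-ins for the actual mutual information decompositions described in the paper.

```python
# Illustrative sketch only: module and loss names are hypothetical stand-ins for
# the paper's mutual-information lower bounds; the exact decomposition and term
# weighting are defined in the paper itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Projector(nn.Module):
    """Small transformer projector with token aggregation on top of a frozen backbone."""
    def __init__(self, dim=768, n_classes=10, n_layers=2, n_heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, tokens):                # tokens: (B, T, dim) from the foundation model
        z = self.encoder(tokens).mean(dim=1)  # token aggregation (mean pooling assumed here)
        return z, self.head(z)                # latent representation and class logits

def semi_supervised_loss(logits_l, y_l, logits_u, z_a, z_b, tau=0.1):
    # i) downstream-task terms: conditional cross-entropy on labelled data ...
    ce = F.cross_entropy(logits_l, y_l)
    # ... plus a marginal cross-entropy / KL term pulling the batch-averaged
    # prediction on unlabelled data towards a class prior (uniform prior assumed).
    p_marg = F.softmax(logits_u, dim=1).mean(dim=0)
    prior = torch.full_like(p_marg, 1.0 / p_marg.numel())
    kl = torch.sum(p_marg * (p_marg.clamp_min(1e-8).log() - prior.log()))
    # ii) latent-space term: contrastive-like (InfoNCE-style) alignment between
    # two latent views z_a, z_b of the same unlabelled batch.
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    sim = z_a @ z_b.t() / tau
    targets = torch.arange(sim.size(0), device=sim.device)
    contrastive = F.cross_entropy(sim, targets)
    return ce + kl + contrastive
```

In a fine-tuning loop of this kind, only the projector parameters would receive gradients while the foundation model stays frozen, consistent with the abstract's statement that the pre-trained structure is retained.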

Related articles:
arXiv:2501.15955 [cs.LG] (Published 2025-01-27)
Rethinking the Bias of Foundation Model under Long-tailed Distribution
arXiv:2206.03826 [cs.LG] (Published 2022-06-08)
Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks
arXiv:2502.05505 [cs.LG] (Published 2025-02-08)
Differentially Private Synthetic Data via APIs 3: Using Simulators Instead of Foundation Model