arXiv Analytics


arXiv:2503.07851 [cs.LG]

TwinTURBO: Semi-Supervised Fine-Tuning of Foundation Models via Mutual Information Decompositions for Downstream Task and Latent Spaces

Guillaume Quétant, Pavlo Molchanov, Slava Voloshynovskiy

Published 2025-03-10, updated 2025-05-16 (version 2)

We present a semi-supervised fine-tuning framework for foundation models that utilises mutual information decomposition to address the challenge of training with a limited amount of labelled data. Our approach derives two distinct lower bounds: i) for the downstream task space, such as classification, optimised using conditional and marginal cross-entropy alongside Kullback-Leibler divergence, and ii) for the latent space representation, regularised and aligned using a contrastive-like decomposition. This fine-tuning strategy retains the pre-trained structure of the foundation model, modifying only a specialised projector module comprising a small transformer and a token aggregation technique. Experiments on several datasets demonstrate significant improvements in classification tasks under extremely low-label conditions by effectively leveraging unlabelled data.
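The abstract names two families of loss terms and a lightweight projector trained on top of a frozen backbone. The following PyTorch sketch illustrates one possible instantiation under stated assumptions; it is not the paper's implementation. The Projector class, mean-token aggregation, the uniform class prior, and the InfoNCE-style alignment between two latent views are all hypothetical stand-ins for the actual mutual information decompositions described in the paper.

```python
# Illustrative sketch only: module and loss names are hypothetical stand-ins for
# the paper's mutual-information lower bounds; the exact decomposition and term
# weighting are defined in the paper itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Projector(nn.Module):
    """Small transformer projector with token aggregation on top of a frozen backbone."""
    def __init__(self, dim=768, n_classes=10, n_layers=2, n_heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, tokens):                # tokens: (B, T, dim) from the foundation model
        z = self.encoder(tokens).mean(dim=1)  # token aggregation (mean pooling assumed here)
        return z, self.head(z)                # latent representation and class logits

def semi_supervised_loss(logits_l, y_l, logits_u, z_a, z_b, tau=0.1):
    # i) downstream-task terms: conditional cross-entropy on labelled data ...
    ce = F.cross_entropy(logits_l, y_l)
    # ... plus a marginal cross-entropy / KL term pulling the batch-averaged
    # prediction on unlabelled data towards a class prior (uniform prior assumed).
    p_marg = F.softmax(logits_u, dim=1).mean(dim=0)
    prior = torch.full_like(p_marg, 1.0 / p_marg.numel())
    kl = torch.sum(p_marg * (p_marg.clamp_min(1e-8).log() - prior.log()))
    # ii) latent-space term: contrastive-like (InfoNCE-style) alignment between
    # two latent views z_a, z_b of the same unlabelled batch.
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    sim = z_a @ z_b.t() / tau
    targets = torch.arange(sim.size(0), device=sim.device)
    contrastive = F.cross_entropy(sim, targets)
    return ce + kl + contrastive
```

In a fine-tuning loop of this kind, only the projector parameters would receive gradients while the foundation model stays frozen, consistent with the abstract's statement that the pre-trained structure is retained.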

Related articles:
arXiv:2501.15955 [cs.LG] (Published 2025-01-27)
Rethinking the Bias of Foundation Model under Long-tailed Distribution
arXiv:2206.03826 [cs.LG] (Published 2022-06-08)
Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks
arXiv:2502.05505 [cs.LG] (Published 2025-02-08)
Differentially Private Synthetic Data via APIs 3: Using Simulators Instead of Foundation Model