arXiv:2305.18425 [cs.LG]

Efficient Storage of Fine-Tuned Models via Low-Rank Approximation of Weight Residuals

Simo Ryu, Seunghyun Seo, Jaejun Yoo

Published 2023-05-28 (Version 1)

In this paper, we present an efficient method for storing fine-tuned models by leveraging the low-rank properties of weight residuals. Our key observation is that weight residuals in large overparameterized models exhibit even stronger low-rank characteristics than the weights themselves. Based on this insight, we propose Efficient Residual Encoding (ERE), a novel approach that stores fine-tuned model weights compactly by approximating the low-rank weight residuals. Furthermore, we analyze the robustness of weight residuals and push storage efficiency further through additional quantization and layer-wise rank allocation. Our experimental results demonstrate that our method significantly reduces the memory footprint while preserving performance across various tasks and modalities. We release our code.
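The core idea of compressing a weight residual with a truncated SVD can be sketched as follows. This is a minimal illustration under the abstract's stated assumptions, not the authors' released implementation; the function names (encode_residual, decode_residual), the choice of PyTorch, and the layer sizes and rank are hypothetical.

```python
import torch

def encode_residual(w_pretrained: torch.Tensor, w_finetuned: torch.Tensor, rank: int):
    """Approximate the weight residual with its top-`rank` SVD factors."""
    residual = w_finetuned - w_pretrained  # what fine-tuning changed
    U, S, Vh = torch.linalg.svd(residual, full_matrices=False)
    # Keeping only `rank` singular triplets cuts storage for an m x n layer
    # from m*n values to rank*(m + n + 1).
    return U[:, :rank], S[:rank], Vh[:rank, :]

def decode_residual(w_pretrained: torch.Tensor, U, S, Vh) -> torch.Tensor:
    """Reconstruct an approximate fine-tuned weight from the stored factors."""
    return w_pretrained + (U * S) @ Vh

# Usage: compress a hypothetical 4096x4096 layer at rank 16.
w_pre = torch.randn(4096, 4096)
w_ft = w_pre + 0.01 * torch.randn(4096, 16) @ torch.randn(16, 4096)  # low-rank drift
U, S, Vh = encode_residual(w_pre, w_ft, rank=16)
w_approx = decode_residual(w_pre, U, S, Vh)
print(torch.linalg.norm(w_ft - w_approx) / torch.linalg.norm(w_ft))  # relative error
```

The stored factors could additionally be quantized and the rank varied per layer, as the abstract describes, but those steps are omitted here for brevity.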

Related articles:
arXiv:2006.10653 [cs.LG] (Published 2020-06-18)
Precise expressions for random projections: Low-rank approximation and randomized Newton
arXiv:2410.09615 [cs.LG] (Published 2024-10-12, updated 2025-02-04)
SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression
arXiv:2409.16223 [cs.LG] (Published 2024-09-24)
Fine-Tuning is Fine, if Calibrated
Zheda Mai et al.