arXiv:2305.13035 Abstract | arXiv Analytics

arXiv:2305.13035 [cs.CV]

Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

Ibrahim Alabdulmohsin, Xiaohua Zhai, Alexander Kolesnikov, Lucas Beyer

Published 2023-05-22, Version 1

Scaling laws have recently been employed to derive compute-optimal model size (number of parameters) for a given compute duration. We advance and refine such methods to infer compute-optimal model shapes, such as width and depth, and successfully implement this in vision transformers. Our shape-optimized vision transformer, SoViT, achieves results competitive with models that exceed twice its size, despite being pre-trained with an equivalent amount of compute. For example, SoViT-400m/14 achieves 90.3% fine-tuning accuracy on ILSVRC2012, surpassing the much larger ViT-g/14 and approaching ViT-G/14 under identical settings, while also requiring less than half the inference cost. We conduct a thorough evaluation across multiple tasks, such as image classification, captioning, VQA and zero-shot transfer, demonstrating the effectiveness of our model across a broad range of domains and identifying limitations. Overall, our findings challenge the prevailing approach of blindly scaling up vision models and pave a path for more informed scaling.

Comments: 10 pages, 7 figures, 9 tables

Categories: cs.CV, cs.LG

Subjects: I.2.6

Keywords: compute-optimal model design, scaling laws, getting vit, infer compute-optimal model shapes, derive compute-optimal model

Related articles:

arXiv:2312.04567 [cs.CV] (Published 2023-12-07)

Scaling Laws of Synthetic Images for Model Training ... for Now

Lijie Fan, Kaifeng Chen, Dilip Krishnan, Dina Katabi, Phillip Isola, Yonglong Tian

arXiv:2404.02973 [cs.CV] (Published 2024-04-03)

Scaling Laws for Galaxy Images

Mike Walmsley et al.
