arXiv:2402.12451 [cs.CV]
Keywords: multimodal large language models, revolution, visual modalities, significant research efforts, multimodal alignment strategies