arXiv:2405.14974 [cs.CV]
Keywords: visual question answering, assessment, results demonstrate consistent performance improvements, current multimodal large language models
Tags: github project