arXiv:2411.12951 [cs.CV]
Keywords: video large language models, consistency, video content, temporal comprehension capabilities, retrieve video moments