arXiv:2310.08669 [cs.CV]

Keywords: multimodal large language model, visual navigation, outperforms state-of-the-art behavior cloning methods, fine-tune large language models, text prompt