arXiv:2505.01828 [math.OC]
Keywords: rank-one modified value iteration, transition probability matrix, learning problem, algorithm consistently outperforms first-order algorithms, rank-one approximation
Tags: conference paper