arXiv:2202.06450 [cs.LG]

Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality

Jiawei Huang, Jinglin Chen, Li Zhao, Tao Qin, Nan Jiang, Tie-Yan Liu

Published 2022-02-14 (Version 1)

Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL). Despite the community's increasing interest, the problem still lacks a formal theoretical formulation. In this paper, we propose such a formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective: we are interested in exploring an MDP and obtaining a near-optimal policy within minimal \emph{deployment complexity}, whereas in each deployment the policy can sample a large batch of data. Using finite-horizon linear MDPs as a concrete structural model, we reveal the fundamental limit in achieving deployment efficiency by establishing information-theoretic lower bounds, and provide algorithms that achieve the optimal deployment efficiency. Moreover, our formulation for DE-RL is flexible and can serve as a building block for other practically relevant settings; we give "Safe DE-RL" and "Sample-Efficient DE-RL" as two examples, which may be worth future investigation.
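
To make the interaction protocol concrete, the sketch below illustrates one way to read the DE-RL loop described in the abstract: a fixed policy is deployed, a large batch of episodes is collected under it, and the policy is only updated between deployments, so the quantity to minimize is the number of deployments rather than the total number of samples. This is a minimal illustrative sketch, not the paper's algorithm; the environment interface, the `update_policy` routine, and the batch size are hypothetical placeholders.

    # Illustrative sketch of a deployment-efficient RL interaction loop.
    # `env`, `initial_policy`, and `update_policy` are hypothetical
    # placeholders supplied by the user, not the paper's algorithm.
    def deployment_efficient_loop(env, initial_policy, update_policy,
                                  num_deployments=10, batch_size=10_000,
                                  horizon=20):
        policy = initial_policy
        for _ in range(num_deployments):       # deployment complexity = outer-loop count
            batch = []
            for _ in range(batch_size):        # large batch collected per deployment
                state = env.reset()
                trajectory = []
                for h in range(horizon):       # finite-horizon episode
                    action = policy(state, h)
                    next_state, reward, done = env.step(action)
                    trajectory.append((state, action, reward, next_state))
                    state = next_state
                    if done:
                        break
                batch.append(trajectory)
            policy = update_policy(policy, batch)  # offline update between deployments
        return policy

Note that in this reading the constraint is on the outer loop: each pass may gather many samples, but the learner pays for every policy switch, which matches the "optimization with constraints" framing in the abstract.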

Related articles:
arXiv:1907.05444 [cs.LG] (Published 2019-07-11)
On the Optimality of Trees Generated by ID3
arXiv:2006.03647 [cs.LG] (Published 2020-06-05)
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
arXiv:2202.11853 [cs.LG] (Published 2022-02-24)
Attainability and Optimality: The Equalized Odds Fairness Revisited