arXiv Analytics

arXiv:1802.07668 [math.OC]

A model for system uncertainty in reinforcement learning

Ryan Murray, Michele Palladino

Published 2018-02-21 (Version 1)

This work provides a rigorous framework for studying continuous-time control problems in uncertain environments. The framework models uncertainty in the state dynamics as a measure on a space of functions, and this measure changes over time as the agent learns its environment. The model can be seen as a variant of either Bayesian reinforcement learning or adaptive control. We study necessary conditions for locally optimal trajectories within this model, in particular deriving an appropriate dynamic programming principle and Hamilton-Jacobi equations. The model provides one possible framework for studying the tradeoff between exploration and exploitation in reinforcement learning.
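
As a schematic illustration of the setup the abstract describes, one could write the controlled dynamics with an unknown vector field f, over which the agent holds a time-varying belief measure. The notation below (f, mu_t, the value function V, the running and terminal costs) is assumed for exposition and is not taken from the paper:

    \[
      \dot{x}(t) = f(x(t), u(t)), \qquad f \sim \mu_t \in \mathcal{P}(\mathcal{F}),
    \]

where \(\mathcal{F}\) is a space of admissible dynamics and \(\mu_t\) is updated as the agent observes its environment. A value function of the form

    \[
      V(t, x, \mu) = \inf_{u(\cdot)} \; \mathbb{E}_{f \sim \mu}
      \left[ \int_t^T \ell(x(s), u(s)) \, ds + g(x(T)) \right]
    \]

would then formally satisfy a dynamic programming principle and an associated Hamilton-Jacobi equation posed on the augmented state \((x, \mu)\), which is where the exploration-exploitation tradeoff enters: controls affect both the state and the information gained about f.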

Related articles:
arXiv:2502.04788 [math.OC] (Published 2025-02-07)
A non-zero-sum game with reinforcement learning under mean-variance framework
arXiv:2402.08306 [math.OC] (Published 2024-02-13, updated 2024-07-29)
Reinforcement Learning for Docking Maneuvers with Prescribed Performance
arXiv:1906.11392 [math.OC] (Published 2019-06-27)
From self-tuning regulators to reinforcement learning and back again