arXiv:1802.07668 [math.OC]
A model for system uncertainty in reinforcement learning
Ryan Murray, Michele Palladino
Published 2018-02-21, Version 1
This work provides a rigorous framework for studying continuous-time control problems in uncertain environments. The framework models uncertainty in the state dynamics as a measure on the space of functions, and this measure changes over time as agents learn their environment. The model can be seen as a variant of either Bayesian reinforcement learning or adaptive control. We study necessary conditions for locally optimal trajectories within this model, in particular deriving an appropriate dynamic programming principle and Hamilton-Jacobi equations. The model provides one possible framework for studying the tradeoff between exploration and exploitation in reinforcement learning.
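
As a rough illustration of "a measure on the space of functions that changes as agents learn", the sketch below uses a heavily simplified setting that is an assumption and not the paper's construction: the measure is supported on a finite list of candidate drift functions, time is discretized with step `dt`, and the agent updates the weights by Bayes' rule under an assumed Gaussian observation model. All names (`update_posterior`, `candidates`, `noise_std`) are hypothetical.

```python
import math

def update_posterior(weights, candidates, x, u, dx_observed, dt, noise_std):
    """One Bayes update of the weights over candidate dynamics.

    Assumption: the observed increment is f(x, u) * dt plus Gaussian
    noise with standard deviation noise_std.
    """
    likelihoods = []
    for f in candidates:
        residual = dx_observed - f(x, u) * dt
        likelihoods.append(math.exp(-0.5 * (residual / noise_std) ** 2))
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Every candidate is numerically ruled out; keep the old weights.
        return list(weights)
    return [v / total for v in unnormalized]

# Two hypothetical candidate dynamics: a stable and an unstable linear drift.
candidates = [lambda x, u: -x + u, lambda x, u: x + u]
weights = [0.5, 0.5]  # uniform prior over the two candidates

x, dt = 1.0, 0.1
for _ in range(20):
    u = 0.0  # no control input in this toy run
    # The true environment is candidate 0, observed here without noise.
    dx = candidates[0](x, u) * dt
    weights = update_posterior(weights, candidates, x, u, dx, dt, noise_std=0.05)
    x += dx
```

In this toy run the weight on the first candidate approaches 1 as observations accumulate. The control is fixed at zero here, so the example shows only the learning side and not the exploration-exploitation tradeoff the abstract refers to.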