arXiv Analytics

arXiv:2007.03960 [math.OC]

On Entropic Optimization and Path Integral Control

Tom Lefebvre, Guillaume Crevecoeur

Published 2020-07-08, Version 1

This article is motivated by the question of whether optimal control (OC) or dynamic optimization problems can be solved in a fashion similar to how static optimization problems are addressed with Evolutionary Strategies (ES). The latter maintain a sequence of Gaussian search distributions that converge to the optimum. To date, this question has been answered only partially, by a set of algorithms known as Path Integral Control (PIC), which maintain a sequence of locally linear Gaussian feedback controllers. So far, PIC methods have been derived solely from the theory of Linearly Solvable OC, which covers only a narrow subset of optimal control problems and consequently has limited application potential. We aim to address this question within a more general mathematical setting. To that end, we first identify the framework of entropic inference as a suitable setting in which to synthesise stochastic search algorithms. On this basis we establish the formal framework of entropic optimization and provide a compelling justification for the inclusion of entropy measures in stochastic optimization. From this theory follows a formal optimal search distribution sequence that converges monotonically to the Dirac delta distribution centred at the optimum. We then demonstrate how this result can be used to derive Gaussian search distributions similar to those of existing ES. Next, we transfer these ideas from a static to a dynamic setting, thereby establishing the framework of Entropic OC, which shares characteristics with entropy-based Reinforcement Learning. From this theory we can construct a number of formal optimal path distribution sequences, and from these we derive the outlines of a generalised algorithmic framework complementing the existing PIC class. Our main ambition is to reveal how all of these fields are related in a most exciting fashion.
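
As a rough illustration of the static case mentioned in the abstract, the sketch below keeps a Gaussian search distribution, reweights samples with exponential (Boltzmann-like) weights, and refits the Gaussian by weighted moment matching. It is not the authors' algorithm: the function name `entropic_gaussian_search`, the `temperature` parameter, the cost standardisation, and the covariance jitter are all assumptions made for this example.

```python
import numpy as np


def entropic_gaussian_search(f, mean, cov, n_samples=200, n_iters=60,
                             temperature=1.0, seed=0):
    """Illustrative Gaussian search with exponentially weighted updates.

    Hypothetical helper, not taken from the paper: each iteration samples
    from N(mean, cov), weights samples by exp(-cost / temperature), and
    refits mean and covariance to the weighted samples.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    dim = mean.size
    for _ in range(n_iters):
        # Draw candidate solutions from the current search distribution.
        x = rng.multivariate_normal(mean, cov, size=n_samples)
        costs = np.array([f(xi) for xi in x])
        # Assumption: standardise costs so the temperature is scale-free.
        z = (costs - costs.min()) / (costs.std() + 1e-12)
        w = np.exp(-z / temperature)
        w /= w.sum()
        # Weighted moment matching gives the next Gaussian.
        mean = w @ x
        centred = x - mean
        # Small jitter (assumption) keeps the covariance positive definite.
        cov = (centred * w[:, None]).T @ centred + 1e-10 * np.eye(dim)
    return mean, cov


if __name__ == "__main__":
    target = np.array([1.0, -2.0])
    cost = lambda x: float(np.sum((x - target) ** 2))
    final_mean, final_cov = entropic_gaussian_search(cost, [0.0, 0.0], np.eye(2))
    print(final_mean, np.trace(final_cov))
```

On a simple quadratic cost the search distribution in this sketch contracts around the minimiser, which loosely mirrors the abstract's picture of a distribution sequence concentrating at the optimum. The dynamic (path distribution) setting is not covered here.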

Related articles:
arXiv:1406.4026 [math.OC] (Published 2014-06-16, updated 2015-02-11)
Path Integral Control and State Dependent Feedback
arXiv:1309.3615 [math.OC] (Published 2013-09-14, updated 2014-09-17)
Implicit sampling for path integral control, Monte Carlo localization, and SLAM
arXiv:2308.11546 [math.OC] (Published 2023-08-22)
Risk-Minimizing Two-Player Zero-Sum Stochastic Differential Game via Path Integral Control