arXiv Analytics

arXiv:2108.03952 [cs.LG]

Safe Deep Reinforcement Learning for Multi-Agent Systems with Continuous Action Spaces

Ziyad Sheebaelhamd, Konstantinos Zisis, Athina Nisioti, Dimitris Gkouletsos, Dario Pavllo, Jonas Kohler

Published 2021-08-09 | Version 1

Multi-agent control problems constitute an interesting area of application for deep reinforcement learning models with continuous action spaces. Such real-world applications, however, typically come with critical safety constraints that must not be violated. To ensure safety, we enhance the well-known multi-agent deep deterministic policy gradient (MADDPG) framework by adding a safety layer to the deep policy network, which automatically corrects invalid actions. In particular, we extend the idea of linearizing the single-step transition dynamics, introduced for single-agent systems in Safe DDPG (Dalal et al., 2018), to multi-agent settings. We additionally propose to circumvent infeasibility problems in the action-correction step using soft constraints (Kerrigan & Maciejowski, 2000). Results from the theory of exact penalty functions guarantee that, under mild assumptions, the soft formulation still satisfies the original constraints. Empirically, we find that the soft formulation dramatically reduces constraint violations, providing safety even during the learning procedure.
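To make the action-correction step concrete, below is a minimal sketch of a soft-constrained safety layer of the kind the abstract describes, assuming linearized per-step safety constraints of the form c_i(s) + g_i(s)^T a <= C_i (the linearization used in Dalal et al., 2018). The solver choice (cvxpy), the function name correct_action, and the symbols G, c, C, rho are illustrative assumptions, not taken from the paper.

import numpy as np
import cvxpy as cp

def correct_action(a_pi: np.ndarray, G: np.ndarray, c: np.ndarray,
                   C: np.ndarray, rho: float = 100.0) -> np.ndarray:
    """Project the policy's proposed action a_pi onto a softened safe set.

    a_pi : (n,)   action output by the deep policy network
    G    : (m, n) constraint gradients g_i(s), one row per constraint
    c    : (m,)   current constraint values c_i(s)
    C    : (m,)   constraint thresholds
    rho  : L1 penalty weight on the slack variables; exact-penalty theory
           says a sufficiently large rho recovers the hard constraints
           whenever they are feasible.
    """
    n, m = a_pi.shape[0], G.shape[0]
    a = cp.Variable(n)
    xi = cp.Variable(m, nonneg=True)           # slacks: soft-constraint relaxation
    objective = cp.Minimize(cp.sum_squares(a - a_pi) + rho * cp.sum(xi))
    constraints = [c + G @ a <= C + xi]        # linearized safety constraints
    cp.Problem(objective, constraints).solve()
    return a.value

# Hypothetical example: a concatenated two-agent action with one
# linear safety constraint per agent.
a_pi = np.array([0.8, -0.3, 0.5, 0.1])
G = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
c = np.array([0.4, 0.2])
C = np.array([1.0, 0.5])
a_safe = correct_action(a_pi, G, c, C)

The L1 penalty on the slacks is what makes the relaxation exact: unlike a quadratic penalty, it is nonsmooth at zero, so for a large enough rho the optimizer drives the slacks to exactly zero whenever the hard constraints are feasible, while still returning a well-defined corrected action when they are not.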

Comments: ICML 2021 Workshop on Reinforcement Learning for Real Life
Categories: cs.LG, cs.RO
Related articles:
arXiv:1911.06833 [cs.LG] (Published 2019-11-15)
Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient
arXiv:1810.09103 [cs.LG] (Published 2018-10-22)
Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces
arXiv:1902.04118 [cs.LG] (Published 2019-02-11)
WiseMove: A Framework for Safe Deep Reinforcement Learning for Autonomous Driving