{ "id": "2211.15457", "version": "v2", "published": "2022-11-28T15:48:35.000Z", "updated": "2023-01-02T20:14:02.000Z", "title": "Hypernetworks for Zero-shot Transfer in Reinforcement Learning", "authors": [ "Sahand Rezaei-Shoshtari", "Charlotte Morissette", "Francois Robert Hogan", "Gregory Dudek", "David Meger" ], "comment": "AAAI 2023", "categories": [ "cs.LG" ], "abstract": "In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach is based upon viewing each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy and seek to approximate it with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, this mapping can be considered as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.", "revisions": [ { "version": "v2", "updated": "2023-01-02T20:14:02.000Z" } ], "analyses": { "keywords": [ "zero-shot transfer", "reinforcement learning", "hypernetwork", "method demonstrates significant improvements", "generate near-optimal value functions" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }