{ "id": "1901.09895", "version": "v1", "published": "2019-01-27T05:06:30.000Z", "updated": "2019-01-27T05:06:30.000Z", "title": "Modularization of End-to-End Learning: Case Study in Arcade Games", "authors": [ "Andrew Melnik", "Sascha Fleer", "Malte Schilling", "Helge Ritter" ], "categories": [ "cs.LG", "cs.AI", "stat.ML" ], "abstract": "Complex environments and tasks pose a difficult problem for holistic end-to-end learning approaches. Decomposition of an environment into interacting controllable and non-controllable objects allows supervised learning for non-controllable objects and universal value function approximator learning for controllable objects. Such a decomposition should lead to a shorter learning time and better generalization capability. Here, we consider arcade-game environments as sets of interacting objects (controllable, non-controllable) and propose a set of functional modules that are specialized in mastering different types of interactions in a broad range of environments. The modules utilize regression, supervised learning, and reinforcement learning algorithms. Results of this case study in different Atari games suggest that human-level performance can be achieved by a learning agent within a human amount of game experience (10-15 minutes of game time) when a proper decomposition of an environment or a task is provided. However, automating such a decomposition remains a challenging problem. This case study shows how a model of the causal structure underlying an environment or a task can benefit the learning time and generalization capability of the agent, and argues in favor of exploiting modular structure in contrast to using pure end-to-end learning approaches.", "revisions": [ { "version": "v1", "updated": "2019-01-27T05:06:30.000Z" } ], "analyses": { "keywords": [ "case study", "arcade games", "value function approximator learning", "environment", "end-to-end learning approaches" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }