arXiv Analytics

arXiv:2002.06723 [cs.LG]

Reward Design for Driver Repositioning Using Multi-Agent Reinforcement Learning

Zhenyu Shou, Xuan Di

Published 2020-02-17 (Version 1)

A large portion of passenger requests reportedly goes unserved, partly due to vacant for-hire drivers' cruising behavior while seeking passengers. This paper models the multi-driver repositioning task with a mean field multi-agent reinforcement learning (MARL) approach. Noticing that directly applying MARL to the multi-driver system under a given reward mechanism will very likely yield a suboptimal equilibrium due to the selfishness of drivers, this study proposes a reward design scheme with which a more desirable equilibrium can be reached. To efficiently solve the resulting bilevel optimization problem, whose upper level is the reward design and whose lower level is a multi-agent system (MAS), a Bayesian optimization algorithm is adopted to speed up the learning process. The proposed model is then tested on a synthetic dataset. The results show that the weighted average of order response rate and overall service charge can be improved by 4% using a simple platform service charge, compared with the case of no reward design.
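The bilevel structure described in the abstract can be sketched as follows. This is a minimal toy, not the paper's method: random search stands in for the Bayesian optimization of the upper level, and the lower-level "equilibrium" is an invented one-parameter driver model (the acceptance probability, order counts, and objective weight are all assumptions for illustration).

```python
import random

def lower_level_equilibrium(charge, n_orders=60, seed=0):
    """Toy stand-in for the mean-field MARL lower level.

    Each vacant driver accepts a nearby order with a probability that
    falls as the platform service charge rises (a crude model of the
    drivers' selfish best response). Returns the order response rate.
    """
    rng = random.Random(seed)
    accept_prob = max(0.0, 1.0 - charge)  # assumption: linear opt-out
    served = sum(1 for _ in range(n_orders) if rng.random() < accept_prob)
    return served / n_orders

def upper_level_objective(charge, weight=0.2):
    """Weighted average of order response rate and (normalized) service
    charge collected by the platform, mimicking the paper's metric."""
    rate = lower_level_equilibrium(charge)
    revenue = charge * rate  # per-order revenue, already in [0, 1]
    return weight * rate + (1.0 - weight) * revenue

def search_reward_design(n_trials=40, seed=1):
    """Random search over the service charge, a simple stand-in for the
    Bayesian optimization the paper uses to solve the upper level."""
    rng = random.Random(seed)
    best_charge, best_value = 0.0, upper_level_objective(0.0)
    for _ in range(n_trials):
        charge = rng.uniform(0.0, 1.0)
        value = upper_level_objective(charge)
        if value > best_value:
            best_charge, best_value = charge, value
    return best_charge, best_value
```

With these toy dynamics the objective has an interior optimum: charging nothing maximizes the response rate but earns nothing, while charging too much drives drivers away, so the search settles on a moderate charge — the same trade-off the reward design in the paper navigates.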
