arXiv:1504.02089 [cs.LG]
The Computational Power of Optimization in Online Learning
Published 2015-04-08 (Version 1)
We study the computational relationship between optimization, online learning, and learning in games. We propose an oracle-based model in which the online player is given access to an optimization oracle for the corresponding offline problem, and investigate the impact of such an oracle on the computational complexity of the online problem. First, we consider the fundamental setting of prediction with the advice of $N$ experts, augmented with an optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give an algorithm that attains vanishing regret in total runtime of $\widetilde{O}(\sqrt{N})$. We also give a lower bound showing that, up to logarithmic factors, this running time cannot be improved in the optimization oracle model. These results attest that an optimization oracle gives rise to a quadratic speedup over the standard oracle-free setting, in which the required time for vanishing regret is $\widetilde{\Theta}(N)$. We then consider the closely related problem of learning in repeated $N \times N$ zero-sum games, in a setting where the players have access to oracles that compute, in constant time, the best response to any mixed strategy of their opponent. We give an efficient algorithm in this model that converges to the value of the game in total time $\widetilde{O}(\sqrt{N})$, again yielding a quadratic improvement over the state of the art in the oracle-free setting, in which $\widetilde{\Theta}(N)$ is known to be tight (Grigoriadis and Khachiyan, 1995). We then show that this running time is the best possible, thereby obtaining a tight characterization of the computational power of best-response oracles in zero-sum games.
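To make the oracle model concrete, the following is a minimal Python sketch (not taken from the paper) of the experts setting with a hindsight-leader oracle. The follow-the-leader player shown only illustrates how such an oracle is queried inside the online loop; it is not the $\widetilde{O}(\sqrt{N})$-time algorithm referred to in the abstract, and all names in it are hypothetical.

```python
import numpy as np

# Illustrative sketch of the oracle-based experts model: the online player
# never scans all N experts itself; it only queries an oracle that returns
# the leading expert in hindsight.  The follow-the-leader strategy below is
# for illustration only and does not guarantee vanishing regret in general.

rng = np.random.default_rng(0)
N, T = 1000, 200                  # number of experts, number of rounds
losses = rng.random((T, N))       # adversary's loss matrix, revealed one round at a time

cumulative = np.zeros(N)          # maintained here only to *simulate* the oracle

def optimization_oracle():
    """Return the leading expert in retrospect (modeled as a constant-time call)."""
    return int(np.argmin(cumulative))

player_loss = 0.0
for t in range(T):
    i_t = optimization_oracle()   # player's choice: follow the leader via the oracle
    player_loss += losses[t, i_t]
    cumulative += losses[t]       # environment reveals the round-t losses

best_loss = cumulative.min()      # loss of the best expert in hindsight
print(f"regret of this naive oracle-based player: {player_loss - best_loss:.2f}")
```

The point of the sketch is the interface: every piece of information the player extracts about the $N$ experts flows through calls to `optimization_oracle`, which is the resource whose computational power the paper quantifies.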