arXiv Analytics

arXiv:2006.14606 [cs.LG]

Global Convergence and Induced Kernels of Gradient-Based Meta-Learning with Neural Nets

Haoxiang Wang, Ruoyu Sun, Bo Li

Published 2020-06-25 (Version 1)

Gradient-based meta-learning (GBML) with deep neural nets (DNNs) has become a popular approach for few-shot learning. However, due to the non-convexity of DNNs and the complex bi-level optimization in GBML, the theoretical properties of GBML with DNNs remain largely unknown. In this paper, we first develop a novel theoretical analysis to answer the following question: Does GBML with DNNs have global convergence guarantees? We provide a positive answer by proving that GBML with over-parameterized DNNs is guaranteed to converge to global optima at a linear rate. The second question we aim to address is: How does GBML achieve fast adaptation to new tasks with experience from similar past tasks? To answer it, we prove that GBML is equivalent to a functional gradient descent operation that explicitly propagates experience from past tasks to new ones. Finally, inspired by our theoretical analysis, we develop a new kernel-based meta-learning approach. We show that the proposed approach outperforms GBML with standard DNNs on the Omniglot dataset when the number of past tasks available for meta-training is small. The code is available at https://github.com/AI-secure/Meta-Neural-Kernel .
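
As a concrete reference for the bi-level optimization mentioned in the abstract, the sketch below shows one MAML-style GBML meta-update in PyTorch. It is a minimal illustration only: the task format, the single inner gradient step, the mean-squared-error loss, and the hyperparameters inner_lr and outer_lr are assumptions made here for exposition, not the authors' implementation (which lives in the linked repository).

# Minimal sketch of one MAML-style GBML meta-update (requires PyTorch 2.x
# for torch.func.functional_call). Task format and hyperparameters are
# illustrative assumptions, not the paper's actual code.
import torch
import torch.nn as nn
from torch.func import functional_call


def maml_step(model: nn.Module, tasks, inner_lr=0.01, outer_lr=0.001):
    """One outer-loop step of the bi-level optimization behind GBML.

    Each task is a tuple (support_x, support_y, query_x, query_y):
    the inner loop adapts the parameters on the support set, and the
    outer loop updates the meta-parameters using the query-set loss.
    """
    loss_fn = nn.MSELoss()  # placeholder loss for a regression-style illustration
    names, meta_params = zip(*model.named_parameters())
    outer_loss = 0.0

    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: one gradient step on the support set (task adaptation).
        support_loss = loss_fn(model(support_x), support_y)
        grads = torch.autograd.grad(support_loss, meta_params, create_graph=True)
        adapted = {n: p - inner_lr * g for n, p, g in zip(names, meta_params, grads)}

        # Evaluate the task-adapted parameters on the query set.
        query_pred = functional_call(model, adapted, (query_x,))
        outer_loss = outer_loss + loss_fn(query_pred, query_y)

    # Outer loop: differentiate the averaged query loss w.r.t. the
    # meta-parameters (through the inner step) and update them.
    outer_grads = torch.autograd.grad(outer_loss / len(tasks), meta_params)
    with torch.no_grad():
        for p, g in zip(meta_params, outer_grads):
            p -= outer_lr * g

In the over-parameterized regime analyzed in the paper, the training dynamics of such meta-updates can be tracked through an induced kernel, which is what motivates the kernel-based meta-learning variant described in the abstract.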

Related articles:
arXiv:1806.06850 [cs.LG] (Published 2018-06-13)
Polynomial Regression As an Alternative to Neural Nets
arXiv:2401.01869 [cs.LG] (Published 2024-01-03)
On the hardness of learning under symmetries
arXiv:2207.03804 [cs.LG] (Published 2022-07-08)
On the Subspace Structure of Gradient-Based Meta-Learning