arXiv Analytics

arXiv:2006.14606 [cs.LG]

Global Convergence and Induced Kernels of Gradient-Based Meta-Learning with Neural Nets

Haoxiang Wang, Ruoyu Sun, Bo Li

Published 2020-06-25 (Version 1)

Gradient-based meta-learning (GBML) with deep neural nets (DNNs) has become a popular approach for few-shot learning. However, due to the non-convexity of DNNs and the complex bi-level optimization in GBML, the theoretical properties of GBML with DNNs remain largely unknown. In this paper, we first develop a novel theoretical analysis to answer the following question: Does GBML with DNNs have global convergence guarantees? We provide a positive answer by proving that GBML with over-parameterized DNNs is guaranteed to converge to global optima at a linear rate. The second question we aim to address is: How does GBML achieve fast adaptation to new tasks with experience from similar past tasks? To answer it, we prove that GBML is equivalent to a functional gradient descent operation that explicitly propagates experience from past tasks to new ones. Finally, inspired by our theoretical analysis, we develop a new kernel-based meta-learning approach. We show that the proposed approach outperforms GBML with standard DNNs on the Omniglot dataset when the number of past tasks available for meta-training is small. The code is available at https://github.com/AI-secure/Meta-Neural-Kernel .
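
As a concrete reference for the bi-level optimization mentioned in the abstract, the sketch below shows one MAML-style GBML meta-update in PyTorch. It is a minimal illustration only: the task format, the single inner gradient step, the mean-squared-error loss, and the hyperparameters inner_lr and outer_lr are assumptions made here for exposition, not the authors' implementation (which lives in the linked repository).

# Minimal sketch of one MAML-style GBML meta-update (requires PyTorch 2.x
# for torch.func.functional_call). Task format and hyperparameters are
# illustrative assumptions, not the paper's actual code.
import torch
import torch.nn as nn
from torch.func import functional_call


def maml_step(model: nn.Module, tasks, inner_lr=0.01, outer_lr=0.001):
    """One outer-loop step of the bi-level optimization behind GBML.

    Each task is a tuple (support_x, support_y, query_x, query_y):
    the inner loop adapts the parameters on the support set, and the
    outer loop updates the meta-parameters using the query-set loss.
    """
    loss_fn = nn.MSELoss()  # placeholder loss for a regression-style illustration
    names, meta_params = zip(*model.named_parameters())
    outer_loss = 0.0

    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: one gradient step on the support set (task adaptation).
        support_loss = loss_fn(model(support_x), support_y)
        grads = torch.autograd.grad(support_loss, meta_params, create_graph=True)
        adapted = {n: p - inner_lr * g for n, p, g in zip(names, meta_params, grads)}

        # Evaluate the task-adapted parameters on the query set.
        query_pred = functional_call(model, adapted, (query_x,))
        outer_loss = outer_loss + loss_fn(query_pred, query_y)

    # Outer loop: differentiate the averaged query loss w.r.t. the
    # meta-parameters (through the inner step) and update them.
    outer_grads = torch.autograd.grad(outer_loss / len(tasks), meta_params)
    with torch.no_grad():
        for p, g in zip(meta_params, outer_grads):
            p -= outer_lr * g

In the over-parameterized regime analyzed in the paper, the training dynamics of such meta-updates can be tracked through an induced kernel, which is what motivates the kernel-based meta-learning variant described in the abstract.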

Related articles:
arXiv:1806.06850 [cs.LG] (Published 2018-06-13)
Polynomial Regression As an Alternative to Neural Nets
arXiv:2401.01869 [cs.LG] (Published 2024-01-03)
On the hardness of learning under symmetries
arXiv:2207.03804 [cs.LG] (Published 2022-07-08)
On the Subspace Structure of Gradient-Based Meta-Learning