arXiv Analytics

arXiv:2101.08521 [cs.LG]

Out-of-Distribution Generalization Analysis via Influence Function

Haotian Ye, Chuanlong Xie, Yue Liu, Zhenguo Li

Published 2021-01-21 (Version 1)

The mismatch between training and target data is a major challenge for current machine learning systems. When training data are collected from multiple domains and the target domains include all training domains as well as new, unseen ones, we face an Out-of-Distribution (OOD) generalization problem: finding the model with the best OOD accuracy. One common definition of OOD accuracy is worst-domain accuracy. In general, the set of target domains is unknown, and the worst-case target domain may be unseen when the number of observed domains is limited. In this paper, we show that the worst accuracy over the observed domains can dramatically fail to identify the OOD accuracy. To address this, we introduce the Influence Function, a classical tool from robust statistics, into the OOD generalization problem and propose the variance of the influence function as an index for monitoring the stability of a model across training domains. We show that the accuracy on test domains, together with the proposed index, can help us discern whether OOD algorithms are needed and whether a model achieves good OOD generalization.
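The idea of using the spread of influence-function values across training domains as a stability index can be sketched numerically. The toy below is an illustrative assumption, not the paper's actual estimator: it uses a linear least-squares model, computes per-domain influence-style directions -H^{-1} g_d (pooled Hessian H, per-domain gradient g_d), and reports their variance across domains; all function and variable names are hypothetical.

```python
import numpy as np

def domain_influence_variance(X_domains, y_domains, w, ridge=1e-3):
    """Toy stability index (illustrative, not the paper's estimator):
    variance across domains of influence-function-style directions
    -H^{-1} g_d for a linear least-squares model."""
    # Pooled Hessian of the squared loss is X^T X / n; a small ridge
    # term keeps it invertible.
    X_all = np.vstack(X_domains)
    H = X_all.T @ X_all / len(X_all) + ridge * np.eye(len(w))
    H_inv = np.linalg.inv(H)

    # Per-domain influence direction: -H^{-1} grad_d, where grad_d is
    # the average gradient of the squared loss on domain d.
    influences = []
    for X, y in zip(X_domains, y_domains):
        grad = X.T @ (X @ w - y) / len(X)
        influences.append(-H_inv @ grad)
    influences = np.stack(influences)

    # Scalar index: mean per-coordinate variance across domains.
    # Small values suggest the model responds similarly to every
    # training domain; large values signal domain-dependent behavior.
    return float(np.mean(np.var(influences, axis=0)))

rng = np.random.default_rng(0)
w = rng.normal(size=3)
Xs = [rng.normal(size=(50, 3)) for _ in range(4)]
ys = [X @ np.ones(3) + rng.normal(scale=0.1, size=50) for X in Xs]
print(domain_influence_variance(Xs, ys, w))
```

A model fitted equally well to all domains would yield near-identical per-domain directions and hence a small index, matching the abstract's intuition that low influence variance indicates stability across training domains.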

Related articles:
arXiv:2411.15292 [cs.LG] (Published 2024-11-22)
Influence functions and regularity tangents for efficient active learning
arXiv:2108.13624 [cs.LG] (Published 2021-08-31)
Towards Out-Of-Distribution Generalization: A Survey
arXiv:2210.07441 [cs.LG] (Published 2022-10-14)
Characterizing the Influence of Graph Elements