{ "id": "2210.03885", "version": "v1", "published": "2022-10-08T02:28:10.000Z", "updated": "2022-10-08T02:28:10.000Z", "title": "Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts", "authors": [ "Tao Zhong", "Zhixiang Chi", "Li Gu", "Yang Wang", "Yuanhao Yu", "Jin Tang" ], "comment": "Accepted at NeurIPS2022", "categories": [ "cs.LG", "cs.CV" ], "abstract": "In this paper, we tackle the problem of domain shift. Most existing methods perform training on multiple source domains using a single model, and the same trained model is used on all unseen target domains. Such solutions are sub-optimal as each target domain exhibits its own speciality, which is not adapted. Furthermore, expecting the single-model training to learn extensive knowledge from the multiple source domains is counterintuitive. The model is more biased toward learning only domain-invariant features and may result in negative knowledge transfer. In this work, we propose a novel framework for unsupervised test-time adaptation, which is formulated as a knowledge distillation process to address domain shift. Specifically, we incorporate Mixture-of-Experts (MoE) as teachers, where each expert is separately trained on different source domains to maximize their speciality. Given a test-time target domain, a small set of unlabeled data is sampled to query the knowledge from MoE. As the source domains are correlated to the target domains, a transformer-based aggregator then combines the domain knowledge by examining the interconnection among them. The output is treated as a supervision signal to adapt a student prediction network toward the target domain. We further employ meta-learning to enforce the aggregator to distill positive knowledge and the student network to achieve fast adaptation. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art and validates the effectiveness of each proposed component. Our code is available at https://github.com/n3il666/Meta-DMoE.", "revisions": [ { "version": "v1", "updated": "2022-10-08T02:28:10.000Z" } ], "analyses": { "keywords": [ "multiple source domains", "mixture-of-experts", "meta-distillation", "student prediction network", "unseen target domains" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }