arXiv Analytics

arXiv:2303.05699 [cs.CV]

Feature Unlearning for Pre-trained GANs and VAEs

Saemi Moon, Seunghyuk Cho, Dongwoo Kim

Published 2023-03-10, updated 2023-08-24 (Version 2)

We tackle the problem of feature unlearning from pre-trained image generative models: GANs and VAEs. Unlike the common unlearning task, where the unlearning target is a subset of the training set, we aim to unlearn a specific feature, such as the hairstyle in facial images, from a pre-trained generative model. Because the target feature appears only in a local region of an image, unlearning entire images from the pre-trained model may destroy details in the remaining regions. To specify which feature to unlearn, we collect randomly generated images that contain the target feature, identify a latent representation corresponding to that feature, and use the representation to fine-tune the pre-trained model. Through experiments on the MNIST and CelebA datasets, we show that the target features are successfully removed while the fidelity of the original models is preserved. Further experiments with an adversarial attack show that the unlearned model is more robust in the presence of malicious parties.
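
The abstract outlines a three-step procedure (collect generated images containing the feature, identify a corresponding latent representation, fine-tune the model). The sketch below illustrates one plausible reading of that pipeline, not the authors' exact method: `G` is a stand-in pre-trained generator, `has_feature` is a hypothetical feature detector, and the feature direction is estimated as a simple difference of latent means, all of which are assumptions for illustration.

```python
# Minimal sketch of latent-direction feature unlearning (illustrative only).
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
LATENT, IMG = 64, 32 * 32

# Stand-in pre-trained generator; replace with a real GAN generator or VAE decoder.
G = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, IMG), nn.Tanh())
G0 = copy.deepcopy(G).eval()            # frozen copy of the original model

probe = torch.randn(IMG)                # hypothetical feature detector (linear probe)
def has_feature(img: torch.Tensor) -> torch.Tensor:
    return img @ probe > 0

# 1) Collect randomly generated images and split them by feature presence.
with torch.no_grad():
    z = torch.randn(2048, LATENT)
    mask = has_feature(G0(z))

# 2) Identify a latent direction for the target feature (difference of means).
direction = z[mask].mean(0) - z[~mask].mean(0)
direction = direction / direction.norm()

def remove_feature(z: torch.Tensor) -> torch.Tensor:
    # Project latents onto the subspace orthogonal to the feature direction.
    return z - (z @ direction).unsqueeze(1) * direction

# 3) Fine-tune G so each latent reproduces what the original model generates
#    at the feature-free (projected) latent, removing the feature while
#    keeping the rest of the image intact.
opt = torch.optim.Adam(G.parameters(), lr=1e-4)
for step in range(200):
    z = torch.randn(256, LATENT)
    with torch.no_grad():
        target = G0(remove_feature(z))  # feature-free reference from frozen model
    loss = (G(z) - target).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In practice the feature detector would be a trained classifier (or human labels) and the fine-tuning loss would likely include perceptual or adversarial terms to preserve fidelity; the mean-difference direction is only one simple way to obtain the latent representation the abstract refers to.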

Related articles:
arXiv:2103.01542 [cs.CV] (Published 2021-03-02)
TransTailor: Pruning the Pre-trained Model for Improved Transfer Learning
arXiv:2212.02112 [cs.CV] (Published 2022-12-05)
Learning to Learn Better for Video Object Segmentation
arXiv:2404.12292 [cs.CV] (Published 2024-04-18)
Reducing Bias in Pre-trained Models by Tuning while Penalizing Change