arXiv:1911.11756 [cs.LG]

Semi-Supervised Learning for Text Classification by Layer Partitioning

Alexander Hanbo Li, Abhinav Sethy

Published 2019-11-26 (Version 1)

Most recent neural semi-supervised learning (SSL) algorithms rely on adding small perturbations to either the input vectors or their representations. These methods have been successful on computer vision tasks, where images form a continuous manifold, but they are not appropriate for discrete inputs such as sentences. To adapt these methods to text input, we propose to decompose a neural network $M$ into two components $F$ and $U$ so that $M = U\circ F$. The layers in $F$ are then frozen and only the layers in $U$ are updated during most of the training. In this way, $F$ serves as a feature extractor that maps the input to a high-level representation and adds systematic noise using dropout. We can then train $U$ with any state-of-the-art SSL algorithm, such as the $\Pi$-model, temporal ensembling, or mean teacher. Furthermore, this gradual unfreezing schedule also prevents a pretrained model from catastrophic forgetting. The experimental results demonstrate that our approach improves over state-of-the-art methods, especially on short texts.
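
To make the layer-partitioning idea concrete, the sketch below freezes a feature extractor F and trains a head U with a Π-model consistency loss, then unfreezes F gradually. It is a minimal PyTorch approximation of the scheme described in the abstract, not the authors' implementation: the toy GRU encoder, the names FeatureExtractor, Head, pi_model_step, and gradually_unfreeze, the dropout rate, and the unfreezing schedule are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as func

    class FeatureExtractor(nn.Module):  # F: frozen during most of training
        def __init__(self, vocab_size=10000, dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.encoder = nn.GRU(dim, dim, batch_first=True)
            self.dropout = nn.Dropout(0.3)  # source of stochastic perturbation

        def forward(self, tokens):
            h, _ = self.encoder(self.embed(tokens))
            return self.dropout(h.mean(dim=1))  # noisy high-level representation

    class Head(nn.Module):  # U: the only component updated most of the time
        def __init__(self, dim=128, num_classes=2):
            super().__init__()
            self.fc = nn.Linear(dim, num_classes)

        def forward(self, z):
            return self.fc(z)

    F_net, U_net = FeatureExtractor(), Head()
    F_net.requires_grad_(False)  # freeze F; dropout stays active in train mode
    opt = torch.optim.Adam(U_net.parameters(), lr=1e-3)

    def pi_model_step(labeled_x, labels, unlabeled_x, w=1.0):
        # Supervised cross-entropy on labeled data plus a Pi-model
        # consistency penalty between two dropout-perturbed views.
        sup = func.cross_entropy(U_net(F_net(labeled_x)), labels)
        z1, z2 = F_net(unlabeled_x), F_net(unlabeled_x)  # two noisy views
        cons = func.mse_loss(U_net(z1).softmax(-1), U_net(z2).softmax(-1))
        loss = sup + w * cons
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

    def gradually_unfreeze(layers, epoch, schedule):
        # Unfreeze F from the top: the i-th layer from the output side
        # becomes trainable once `epoch` reaches schedule[i].
        for i, layer in enumerate(reversed(layers)):
            if i < len(schedule) and epoch >= schedule[i]:
                layer.requires_grad_(True)

Because F's weights receive no gradient while frozen, the only randomness entering U's input is F's dropout, which plays the role that input-space perturbations play in vision SSL; the same head-training loop would work unchanged with temporal ensembling or mean teacher in place of the Π-model consistency term.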

Related articles:
arXiv:1907.06065 [cs.LG] (Published 2019-07-13)
Bringing Giant Neural Networks Down to Earth with Unlabeled Data
arXiv:2204.10810 [cs.LG] (Published 2022-04-22)
Learning to Scaffold: Optimizing Model Explanations for Teaching
arXiv:1905.12916 [cs.LG] (Published 2019-05-30)
Effective Medical Test Suggestions Using Deep Reinforcement Learning