arXiv Analytics

arXiv:2112.14804 [cs.CV]

Learning Inception Attention for Image Synthesis and Image Recognition

Jianghao Shen, Tianfu Wu

Published 2021-12-29, updated 2022-02-04 (Version 2)

Image synthesis and image recognition have witnessed remarkable progress, but often at the expense of computationally expensive training and inference. Learning lightweight yet expressive deep models has emerged as an important and interesting direction. Inspired by the well-known split-transform-aggregate design heuristic of the Inception building block, this paper proposes a Skip-Layer Inception Module (SLIM) that facilitates efficient learning of image synthesis models, and a same-layer variant (also dubbed SLIM) as a stronger alternative to the well-known ResNeXts for image recognition. In SLIM, the input feature map is first split into a number of groups (e.g., 4). Each group is then transformed into a latent style vector (via channel-wise attention) and a latent spatial mask (via spatial attention). The learned latent masks and latent style vectors are aggregated to modulate the target feature map. For generative learning, SLIM is built on the recently proposed lightweight Generative Adversarial Networks (i.e., FastGANs), which present a skip-layer excitation (SLE) module. For few-shot image synthesis tasks, the proposed SLIM achieves better performance than the SLE work and other related methods. For one-shot image synthesis tasks, it shows a stronger capability of preserving image structures than prior art such as the SinGANs. For image classification tasks, the proposed SLIM is used as a drop-in replacement for convolution layers in ResNets (resulting in ResNeXt-like models) and achieves better accuracy on the ImageNet-1000 dataset, with significantly smaller model complexity.
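To make the split-transform-aggregate scheme above concrete, here is a minimal PyTorch sketch of one plausible reading of the skip-layer setting, where a low-resolution feature map modulates a higher-resolution target. The class name SLIMSketch, the layer sizes, the exact attention forms, and the mean aggregation are illustrative assumptions, not the authors' reference implementation.

import torch
import torch.nn as nn

class SLIMSketch(nn.Module):
    """Hypothetical SLIM-style module: split the input feature map into
    groups, derive a latent style vector (channel-wise attention) and a
    latent spatial mask (spatial attention) per group, then aggregate
    them to modulate a target feature map."""

    def __init__(self, in_channels: int, target_channels: int, groups: int = 4):
        super().__init__()
        assert in_channels % groups == 0
        self.groups = groups
        gc = in_channels // groups
        # Channel-wise attention per group -> latent style vector.
        self.style = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(gc, target_channels, kernel_size=1),
                nn.Sigmoid(),
            )
            for _ in range(groups)
        )
        # Spatial attention per group -> latent spatial mask.
        self.mask = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(gc, 1, kernel_size=3, padding=1),
                nn.Sigmoid(),
            )
            for _ in range(groups)
        )

    def forward(self, x: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Split the input feature map into `groups` chunks along channels.
        chunks = torch.chunk(x, self.groups, dim=1)
        mod = 0
        for g, chunk in enumerate(chunks):
            style = self.style[g](chunk)   # (B, C_t, 1, 1)
            mask = self.mask[g](chunk)     # (B, 1, H_in, W_in)
            # Resize each mask to the target's spatial size before use.
            mask = nn.functional.interpolate(
                mask, size=target.shape[-2:], mode="bilinear",
                align_corners=False)
            mod = mod + style * mask       # broadcast to (B, C_t, H_t, W_t)
        # Aggregate by averaging over groups (an assumed aggregation rule)
        # and modulate the target feature map.
        return target * (mod / self.groups)

A usage example under the same assumptions, with a low-resolution skip feature modulating a higher-resolution target:

slim = SLIMSketch(in_channels=64, target_channels=256, groups=4)
low = torch.randn(2, 64, 8, 8)      # low-resolution skip features
high = torch.randn(2, 256, 32, 32)  # target feature map to modulate
out = slim(low, high)               # same shape as `high`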

Related articles:
arXiv:1802.02207 [cs.CV] (Published 2018-01-22)
Automated dataset generation for image recognition using the example of taxonomy
arXiv:2105.01883 [cs.CV] (Published 2021-05-05)
RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
arXiv:2103.16889 [cs.CV] (Published 2021-03-31)
Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition