arXiv Analytics

arXiv:1511.05045 [cs.CV]

Handcrafted Local Features are Convolutional Neural Networks

Zhenzhong Lan, Shoou-I Yu, Ming Lin, Bhiksha Raj, Alexander G. Hauptmann

Published 2015-11-16, Version 1

In image and video classification research, handcrafted local features and learning-based features have been the chief drivers of the considerable progress of the past decades. These two architectures were proposed at roughly the same time and have flourished during overlapping periods, but they are typically viewed as distinct approaches. In this paper, we emphasize their structural similarities and show how such a unified view helps us design features that balance efficiency and effectiveness. As an example, we study the problem of developing an efficient motion feature for action recognition. We approach this problem by first showing that traditional handcrafted local features are Convolutional Neural Networks (CNNs) that can be efficiently trained but have limited modeling capacity. We then propose a two-stream Convolutional PCA-ISA model that enhances the modeling capacity of local feature pipelines while keeping computational complexity low. Through network structures custom-designed for pixels and optical flow, our method reflects the distinctive characteristics of these two data sources. We evaluate the proposed method on the standard action recognition benchmarks UCF101 and HMDB51, where it outperforms state-of-the-art CNN approaches in both training time and accuracy.
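
To make the structural analogy in the abstract concrete, here is a minimal sketch of a simplified, HOG-like gradient-orientation descriptor written as the three stages of a small CNN: convolution with fixed filters, a rectifying nonlinearity, and sum pooling. It is an illustration only, not the authors' PCA-ISA model or any published descriptor. The function names (`correlate2d_valid`, `oriented_filters`, `hog_like_descriptor`), the steered 3x3 derivative filters, and the choice of four orientations are all assumptions made for this example.

```python
import math


def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def oriented_filters(num_orientations=4):
    """Fixed (not learned) 3x3 derivative filters steered to evenly spaced angles."""
    dx = [[0, 0, 0], [-1, 0, 1], [0, 0, 0]]   # horizontal central difference
    dy = [[0, -1, 0], [0, 0, 0], [0, 1, 0]]   # vertical central difference
    filters = []
    for k in range(num_orientations):
        theta = 2 * math.pi * k / num_orientations
        filters.append([
            [math.cos(theta) * dx[i][j] + math.sin(theta) * dy[i][j]
             for j in range(3)]
            for i in range(3)
        ])
    return filters


def hog_like_descriptor(image, num_orientations=4):
    """One descriptor value per orientation: conv -> ReLU -> sum pooling."""
    descriptor = []
    for kernel in oriented_filters(num_orientations):
        response = correlate2d_valid(image, kernel)                    # convolution layer
        rectified = [[max(0.0, v) for v in row] for row in response]   # ReLU nonlinearity
        descriptor.append(sum(sum(row) for row in rectified))          # sum pooling over the patch
    return descriptor


# A 4x4 patch with a vertical edge (dark on the left, bright on the right).
patch = [[0, 0, 1, 1] for _ in range(4)]
print(hog_like_descriptor(patch))
```

For this patch the response concentrates in the first orientation channel (the one aligned with the left-to-right intensity increase), while the other channels are zero up to floating-point rounding. In this sketch the filter weights are fixed by hand; replacing them with weights learned from data is what would turn the same pipeline into a trained convolutional layer.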

Related articles:
arXiv:1606.04189 [cs.CV] (Published 2016-06-14)
Inverting face embeddings with convolutional neural networks
arXiv:1604.03168 [cs.CV] (Published 2016-04-11)
Hardware-oriented Approximation of Convolutional Neural Networks
arXiv:1604.02532 [cs.CV] (Published 2016-04-09)
T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos
Kai Kang et al.