arXiv:1510.08160 [cs.CV]
Scale-aware Fast R-CNN for Pedestrian Detection
Jianan Li, Xiaodan Liang, ShengMei Shen, Tingfa Xu, Shuicheng Yan
Published 2015-10-28, Version 1
While convolutional neural network (CNN) architectures have achieved great success in various vision tasks, the critical scale problem remains under-explored, especially for pedestrian detection. Current approaches mainly rely on training with large numbers of images at different scales to improve network capability, or on fusing results from multi-scale crops of images during testing. Designing a CNN architecture that intrinsically captures the characteristics of both large-scale and small-scale objects while retaining scale invariance is still a very challenging problem. In this paper, we propose a novel scale-aware Fast R-CNN to handle the detection of small object instances, which are very common in pedestrian detection. Our architecture incorporates a large-scale sub-network and a small-scale sub-network into a unified architecture by leveraging scale-aware weighting during training. The heights of object proposals are used to specify different scale-aware weights for the two sub-networks. Extensive evaluations on the challenging Caltech dataset~\cite{dollar2012pedestrian} demonstrate the superiority of the proposed architecture over state-of-the-art methods~\cite{compact,ta_cnn}. In particular, our method reduces the miss rate on the Caltech dataset to $9.68\%$, significantly lower than the $11.75\%$ of CompACT-Deep~\cite{compact} and the $20.86\%$ of TA-CNN~\cite{ta_cnn}.
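The height-based weighting of the two sub-networks described in the abstract can be illustrated with a small sketch. The abstract does not give the weighting function, so everything specific here is an assumption made for illustration: the logistic gate, the function names `scale_aware_weights` and `fuse_scores`, and the parameter values `mean_height`, `alpha`, and `beta`. It shows only the general idea that taller proposals lean on the large-scale sub-network and shorter ones on the small-scale sub-network.

```python
import math


def scale_aware_weights(height, mean_height=50.0, alpha=1.0, beta=10.0):
    """Return (w_large, w_small) for a proposal of the given pixel height.

    Assumed form: a logistic gate centred on mean_height. The parameter
    values are placeholders, not values taken from the paper.
    """
    w_large = 1.0 / (1.0 + alpha * math.exp(-(height - mean_height) / beta))
    return w_large, 1.0 - w_large


def fuse_scores(height, score_large, score_small, **gate_params):
    """Combine the two sub-network scores using the height-dependent weights."""
    w_large, w_small = scale_aware_weights(height, **gate_params)
    return w_large * score_large + w_small * score_small


if __name__ == "__main__":
    # Hypothetical proposals: a short (distant) and a tall (nearby) pedestrian.
    for h in (30, 50, 120):
        w_l, w_s = scale_aware_weights(h)
        print(f"height={h:3d}px  w_large={w_l:.3f}  w_small={w_s:.3f}")
```

With this assumed gate, the two weights always sum to one, so the fused score is a convex combination of the two sub-network outputs, and a proposal whose height equals `mean_height` weights both sub-networks equally when `alpha` is 1.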