{ "id": "1510.08160", "version": "v1", "published": "2015-10-28T01:59:14.000Z", "updated": "2015-10-28T01:59:14.000Z", "title": "Scale-aware Fast R-CNN for Pedestrian Detection", "authors": [ "Jianan Li", "Xiaodan Liang", "ShengMei Shen", "Tingfa Xu", "Shuicheng Yan" ], "categories": [ "cs.CV" ], "abstract": "While convolutional neural network (CNN) architectures have achieved great success in various vision tasks, the critical scale problem remains largely under-explored, especially for pedestrian detection. Current approaches mainly focus on using large numbers of training images at different scales to improve network capability, or on fusing results from multi-scale crops of images during testing. Designing a CNN architecture that can intrinsically capture the characteristics of both large-scale and small-scale objects while retaining scale invariance is still a very challenging problem. In this paper, we propose a novel scale-aware Fast R-CNN to handle the detection of small object instances, which are very common in pedestrian detection. Our architecture incorporates a large-scale sub-network and a small-scale sub-network into a unified architecture by leveraging scale-aware weighting during training. The heights of object proposals are used to specify different scale-aware weights for the two sub-networks. Extensive evaluations on the challenging Caltech dataset~\\cite{dollar2012pedestrian} demonstrate the superiority of the proposed architecture over state-of-the-art methods~\\cite{compact,ta_cnn}. 
In particular, our method reduces the miss rate on the Caltech dataset to $9.68\\%$, significantly lower than the $11.75\\%$ of CompACT-Deep~\\cite{compact} and the $20.86\\%$ of TA-CNN~\\cite{ta_cnn}.", "revisions": [ { "version": "v1", "updated": "2015-10-28T01:59:14.000Z" } ], "analyses": { "keywords": [ "pedestrian detection", "architecture", "novel scale-aware fast r-cnn", "sub-network", "convolutional neural network" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable", "adsabs": "2015arXiv151008160L" } } }