arXiv:1801.05918 Abstract | arXiv Analytics

arXiv:1801.05918 [cs.CV]

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Published 2018-01-18, Version 1

Single Shot MultiBox Detector (SSD) is one of the fastest algorithms in the current object detection field; it uses a fully convolutional neural network to detect objects at all scales in an image. Deconvolutional Single Shot Detector (DSSD) introduces more context information by adding a deconvolution module to SSD, improving the mean Average Precision (mAP) on PASCAL VOC2007 from SSD's 77.5% to 78.6%. Although DSSD's mAP is 1.1 points higher than SSD's, its speed drops from 46 to 11.8 frames per second (FPS). In this paper, we propose a single-stage, end-to-end image detection model called ESSD to overcome this dilemma. Our solution is to extend better context information to the shallow layers of the best single-stage detectors (e.g. SSD). Experimental results show that our model reaches 79.4% mAP, which is higher than DSSD and SSD by 0.8 and 1.9 points respectively. Meanwhile, our testing speed is 25 FPS on a Titan X GPU, more than double that of the original DSSD.
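The accuracy and speed comparisons in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `REPORTED` table and the helper names `map_gain` and `speed_ratio` are hypothetical, the numbers are copied from the abstract, and the speed ratio assumes the FPS figures were measured on comparable hardware, which the abstract does not state for SSD and DSSD.

```python
# Figures as quoted in the abstract: PASCAL VOC2007 mAP (percent) and speed (FPS).
# The table layout and function names are assumptions made for this illustration.
REPORTED = {
    "SSD": {"map": 77.5, "fps": 46.0},
    "DSSD": {"map": 78.6, "fps": 11.8},
    "ESSD": {"map": 79.4, "fps": 25.0},
}


def map_gain(model: str, baseline: str) -> float:
    """Return the mAP difference in percentage points, rounded to one decimal."""
    return round(REPORTED[model]["map"] - REPORTED[baseline]["map"], 1)


def speed_ratio(model: str, baseline: str) -> float:
    """Return how many times faster `model` runs than `baseline` (FPS ratio)."""
    return REPORTED[model]["fps"] / REPORTED[baseline]["fps"]


if __name__ == "__main__":
    print("DSSD vs SSD  mAP gain:", map_gain("DSSD", "SSD"))
    print("ESSD vs DSSD mAP gain:", map_gain("ESSD", "DSSD"))
    print("ESSD vs SSD  mAP gain:", map_gain("ESSD", "SSD"))
    print("ESSD vs DSSD speed ratio: %.2f" % speed_ratio("ESSD", "DSSD"))
    print("ESSD vs SSD  speed ratio: %.2f" % speed_ratio("ESSD", "SSD"))
```

The gains are expressed in percentage points rather than relative percent, which matches how the abstract compares the three models. On these figures ESSD is faster than DSSD but still slower than SSD.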

Comments: 7 pages, 3 figures, 3 tables

Categories: cs.CV

Keywords: single shot multibox detector, convolutional neural network, stage end-to-end image detection, extend better context information, end-to-end image detection model

Related articles:

arXiv:1601.07532 [cs.CV] (Published 2016-01-27)

Learning to Extract Motion from Videos in Convolutional Neural Networks

Damien Teney, Martial Hebert

arXiv:1409.4326 [cs.CV] (Published 2014-09-15)

Computing the Stereo Matching Cost with a Convolutional Neural Network

Jure Žbontar, Yann LeCun

arXiv:1504.02351 [cs.CV] (Published 2015-04-09)

When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition