arXiv:2006.01250 [cs.CV]

Learning to Detect 3D Objects from Point Clouds in Real Time

Abhinav Sagar

Published 2020-05-09, Version 1

In this paper, we present a combined architecture that uses dilated and transposed convolutional neural networks for accurate and efficient semantic image segmentation. In contrast to previous fully convolutional neural networks such as FCN, in which almost all computation is shared across the entire image, we propose an architecture that we call dilated-transposed fully convolutional neural networks. To achieve this, we use dilated convolutional layers for downsampling and transposed convolutional layers for upsampling. We use skip connections between the blocks formed by convolution and max pooling layers. This type of architecture has previously been used successfully for image classification in residual networks. In addition, we found that the SELU activation function gives better results than ReLU on the test set images. We attribute this to SELU keeping the model from getting stuck in a local minimum and thereby avoiding the well-known vanishing gradient problem that can arise with the ReLU activation function. Our approach achieves a pixel-wise class accuracy of 88% on the test set and a mean Intersection over Union (IoU) of 53.5, which is better than the state of the art among previous fully convolutional neural networks.
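
The abstract names three building blocks (dilated convolution, transposed convolution, and the SELU activation) without giving layer configurations. The following is a minimal 1D sketch of those operations in plain Python, meant only to show what each one computes. It is not the author's implementation: the function names, kernels, dilation, and stride are illustrative assumptions, and skip connections and max pooling are left out.

```python
import math

# SELU constants as published by Klambauer et al. (2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit applied to a single float."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with `dilation` steps between kernel taps.

    A larger dilation widens the receptive field without adding weights.
    """
    span = (len(kernel) - 1) * dilation + 1  # inputs covered by one output
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


def transposed_conv1d(signal, kernel, stride=2):
    """1D transposed convolution: each input scatters a scaled kernel copy.

    With stride > 1 the output is longer than the input (upsampling).
    """
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


# Toy pipeline: dilated conv -> SELU -> transposed conv (hypothetical values).
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
features = dilated_conv1d(signal, [1.0, 0.0, -1.0], dilation=2)  # 3 outputs
activated = [selu(v) for v in features]
upsampled = transposed_conv1d(activated, [1.0, 1.0], stride=2)   # 6 outputs
```

In this toy run the dilated step shortens the signal from 7 samples to 3 because each output spans 5 inputs, and the transposed step with stride 2 expands those 3 values back to 6. A 2D segmentation network would apply the same ideas per channel with learned kernels.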

Related articles:
arXiv:1803.06199 [cs.CV] (Published 2018-03-16)
Complex-YOLO: Real-time 3D Object Detection on Point Clouds
arXiv:2012.00656 [cs.CV] (Published 2020-12-01)
Cross-modal registration using point clouds and graph-matching in the context of correlative microscopies
arXiv:1708.03276 [cs.CV] (Published 2017-08-10)
Document Image Binarization with Fully Convolutional Neural Networks