arXiv:2006.01250 Abstract | arXiv Analytics

arXiv:2006.01250 [cs.CV]

Learning to Detect 3D Objects from Point Clouds in Real Time

Published 2020-05-09, Version 1

In this paper, we present a combined architecture that uses dilated and transposed convolutional layers for accurate and efficient semantic image segmentation. In contrast to previous fully convolutional neural networks such as FCN, in which almost all computation is shared over the entire image, we propose an additional architecture, which we call the dilated-transposed fully convolutional neural network. To achieve this goal, we use dilated convolutional layers in the downsampling path and transposed convolutional layers in the upsampling path, with skip connections between the blocks formed by convolution and max-pooling layers. Skip connections of this type have been used successfully in the past for image classification with residual networks. In addition, we found that the SELU activation function gives better results than ReLU on the test-set images. We attribute this to SELU helping the model avoid getting stuck in a local minimum and avoid the well-known vanishing gradient problem that can arise with the ReLU activation function. Our model achieves a pixel-wise class accuracy of 88% on the test set and a mean Intersection over Union (IoU) of 53.5, which is better than the state of the art among previous fully convolutional neural networks.
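The two metrics reported in the abstract can be illustrated with a minimal sketch. The abstract does not say how they were computed, so this assumes the common definitions: pixel accuracy as the fraction of correctly labeled pixels, and mean IoU as the per-class intersection over union averaged over the classes that occur. The function names and the toy label maps are hypothetical and are not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that appear at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true other
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both maps (an assumption)
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical flattened label maps for a 2x3 image with 3 classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

On this toy input four of six pixels are correct, while the per-class IoU values are 1/3, 2/3, and 1/2, so the mean IoU is lower than the pixel accuracy. The same pattern appears in the reported figures (88% accuracy against a mean IoU of 53.5), since IoU penalizes both false positives and false negatives for every class.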

Comments: 13 pages

Categories: cs.CV, cs.LG, eess.IV

Keywords: fully convolutional neural networks, detect 3d objects, point clouds, real time, pixel wise class accuracy

Related articles:

arXiv:1803.06199 [cs.CV] (Published 2018-03-16)

Complex-YOLO: Real-time 3D Object Detection on Point Clouds

Martin Simon, Stefan Milz, Karl Amende, Horst-Michael Gross

arXiv:2012.00656 [cs.CV] (Published 2020-12-01)

Cross-modal registration using point clouds and graph-matching in the context of correlative microscopies

Stephan Kunne, Guillaume Potier, Jean Mérot, Perrine Paul-Gilloteaux

arXiv:1708.03276 [cs.CV] (Published 2017-08-10)

Document Image Binarization with Fully Convolutional Neural Networks