Search Results
Showing 1-9 of 9
-
arXiv:1907.00330 (Published 2019-06-30)
Visual Space Optimization for Zero-shot Learning
Categories: cs.CV
Zero-shot learning, which aims to recognize new categories that are not included in the training set, has gained popularity owing to its potential in real-world applications. Zero-shot learning models rely on learning an embedding space, where both semantic descriptions of classes and visual features of instances can be embedded for nearest neighbor search. Recently, most existing works consider the visual space formulated by deep visual features as an ideal choice of the embedding space. However, the discrete distribution of instances in the visual space makes the data structure unremarkable. We argue that optimizing the visual space is crucial as it allows semantic vectors to be embedded into the visual space more effectively. In this work, we propose two strategies to accomplish this purpose. One is the visual prototype based method, which learns a visual prototype for each visual class, so that, in the visual space, a class can be represented by a prototype feature instead of a series of discrete visual features. The other is to optimize the visual feature structure in an intermediate embedding space; for this we devise a multilayer perceptron based algorithm that learns the common intermediate embedding space while making the visual data structure more distinctive. Through extensive experimental evaluation on four benchmark datasets, we demonstrate that optimizing the visual space is beneficial for zero-shot learning. In addition, the proposed prototype based method achieves new state-of-the-art performance.
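The nearest neighbor search over class prototypes mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the per-class mean used here is only an assumed stand-in for a learned prototype (in the zero-shot setting, prototypes for unseen classes would instead come from semantic vectors embedded into the visual space), and the function names are hypothetical.

```python
import numpy as np

def class_prototypes(features, labels):
    """Mean visual feature per class (assumed stand-in for a learned prototype)."""
    classes = np.unique(labels)
    protos = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def nearest_prototype(queries, classes, protos):
    """Assign each query feature to the class with the closest prototype (Euclidean)."""
    # Pairwise squared distances, shape (n_queries, n_classes).
    d2 = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]

# Toy data: two well-separated classes in a 2-D "visual space".
features = np.array([[0.0, 0.1], [0.2, -0.1], [5.0, 5.1], [4.8, 4.9]])
labels = np.array([0, 0, 1, 1])
classes, protos = class_prototypes(features, labels)
predicted = nearest_prototype(np.array([[0.1, 0.0], [5.2, 5.0]]), classes, protos)
```

Representing each class by a single vector keeps the search cost proportional to the number of classes rather than the number of instances.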
-
arXiv:1811.10026 (Published 2018-11-25)
Multi-view Point Cloud Registration with Adaptive Convergence Threshold and its Application on 3D Model Retrieval
Categories: cs.CV
Multi-view point cloud registration is a hot topic in the communities of multimedia technology and artificial intelligence (AI). In this paper, we propose a framework that reconstructs 3D models using a multi-view point cloud registration algorithm with an adaptive convergence threshold, and subsequently apply it to 3D model retrieval. The iterative closest point (ICP) algorithm is combined with the motion averaging algorithm for the registration of multi-view point clouds. After the registration process, we design applications for 3D model retrieval. The geometric saliency map is computed based on the vertex curvature. The test facial triangle is then generated based on the saliency map and compared with the standard facial triangle. The face and non-face models are then discriminated. The experiments and comparisons demonstrate the effectiveness of the proposed framework.
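For readers unfamiliar with ICP, a minimal pairwise sketch is given here. It uses brute-force nearest neighbors and the standard SVD (Kabsch) solution for the rigid transform. The fixed iteration count is an assumption made for brevity and replaces the adaptive convergence threshold described in the abstract; the function names are hypothetical.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t approximates dst[i] (Kabsch/SVD)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    return R, dst_mean - R @ src_mean

def icp(src, dst, iters=20):
    """Basic ICP with a fixed iteration count (assumption; no convergence test)."""
    dim = src.shape[1]
    R_total, t_total = np.eye(dim), np.zeros(dim)
    cur = src.copy()
    for _ in range(iters):
        # Match every current point to its nearest neighbor in dst.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        matched = dst[d2.argmin(axis=1)]
        R, t = rigid_transform(cur, matched)
        cur = cur @ R.T + t
        # Compose the incremental motion with the accumulated one.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

# Toy check: recover a small known rotation about z plus a small translation.
dst = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 2, 3]], dtype=float)
a = 0.05
R_true = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
t_true = np.array([0.05, -0.02, 0.03])
src = (dst - t_true) @ R_true  # so that R_true @ src[i] + t_true == dst[i]
R_est, t_est = icp(src, dst)
```

ICP is a local method, so this toy case only works because the initial misalignment is small relative to the point spacing.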
-
arXiv:1811.09790 (Published 2018-11-24)
Spatio-Temporal Road Scene Reconstruction using Superpixel MRF
Categories: cs.CV
Scene model construction based on image rendering is a hot topic in the computer vision community. In this paper, we propose a framework to construct road scene models based on 3D corridor structures. The construction of scene models consists of two successive stages: road detection and scene construction. The road detection is implemented via a new superpixel Markov random field (MRF) algorithm. The data fidelity term of the energy function is jointly computed using the superpixel features of color, texture and location. The smoothness term is defined by the interaction of spatio-temporally adjacent superpixels. The control points of road boundaries are generated under a vanishing point constraint. Subsequently, the road scene models are constructed, where the foreground and background regions are modeled independently. Numerous applications are developed based on the proposed framework, e.g., traffic scene simulation. Experiments and comparisons are conducted for both the road detection and scene construction stages, and they demonstrate the effectiveness of the proposed method.
-
arXiv:1803.07360 (Published 2018-03-20)
Adaptive Co-weighting Deep Convolutional Features For Object Retrieval
Comments: 6 pages, 5 figures, ICME2018 poster
Categories: cs.CV
Aggregating deep convolutional features into a global image vector has attracted sustained attention in image retrieval. In this paper, we propose an efficient unsupervised aggregation method that uses an adaptive Gaussian filter and an element-value sensitive vector to co-weight deep features. Specifically, the Gaussian filter assigns large weights to features of regions of interest (RoI) by adaptively determining the RoI's center, while the element-value sensitive channel vector suppresses the burstiness phenomenon by assigning small weights to feature maps with large sum values over all locations. Experimental results on benchmark datasets validate that both proposed weighting schemes effectively improve the discrimination power of image vectors. Furthermore, with the same experimental setting, our method outperforms other very recent aggregation approaches by a considerable margin.
-
arXiv:1710.05193 (Published 2017-10-14)
K-means clustering for efficient registration of multi-view point sets
Categories: cs.CV
This paper casts multi-view registration as a clustering problem, which can be solved by the proposed approach based on K-means clustering. For the clustering, all the centroids are uniformly sampled from the initially aligned point sets involved in the multi-view registration. Then, two standard K-means steps are utilized to assign all points to one special cluster and update each cluster centroid. Subsequently, the shape formed by all cluster centroids can be used to sequentially estimate the rigid transformation for each point set. To obtain accurate results, the K-means steps and transformation estimation should be alternately and iteratively applied to all point sets. Finally, the proposed approach was tested on some public data sets and compared with state-of-the-art algorithms. Experimental results illustrate its good efficiency for the registration of multi-view point sets.
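The two standard K-means steps named in this abstract (assignment and centroid update) can be sketched as follows. This covers only the clustering half of the alternation; the per-scan rigid transformation estimate is omitted, and the function name and the empty-cluster handling are assumptions.

```python
import numpy as np

def kmeans_step(points, centroids):
    """One K-means iteration: assign points to nearest centroids, then update them."""
    # Squared distances between every point and every centroid.
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    assignment = d2.argmin(axis=1)
    updated = centroids.copy()
    for k in range(len(centroids)):
        members = points[assignment == k]
        if len(members) > 0:  # assumption: keep the old centroid if a cluster is empty
            updated[k] = members.mean(axis=0)
    return assignment, updated

# Toy data: points pooled from (already roughly aligned) scans.
points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centroids = np.array([[1.0, 1.0], [9.0, 1.0]])
assignment, centroids_new = kmeans_step(points, centroids)
```

In the scheme the abstract describes, each scan would then be aligned to the centroids its points were assigned to, and the two stages would repeat.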
-
arXiv:1709.08393 (Published 2017-09-25)
Multi-view Registration Based on Weighted Low Rank and Sparse Matrix Decomposition of Motions
Categories: cs.CV
Recently, the low rank and sparse (LRS) matrix decomposition has been introduced as an effective means to solve multi-view registration. However, this method has two notable disadvantages: the registration result is quite sensitive to the sparsity of the LRS matrix, and the decomposition process treats each block element equally regardless of its reliability. Therefore, this paper first proposes a matrix completion method based on the overlap percentage of scan pairs. By completing the LRS matrix with as many reliable block elements as possible, more synchronization constraints of relative motions can be utilized for registration. Furthermore, it is observed that the reliability of each element in the LRS matrix can be weighed by the relationship between its corresponding model and data shapes. Therefore, a weight matrix is designed to measure the contribution of each element to the decomposition, and accordingly the decomposition result is closer to the ground truth than before. Benefiting from the more informative LRS matrix as well as the weight matrix, experimental results on several public datasets demonstrate the superiority of the proposed approach over other methods in both accuracy and robustness.
-
arXiv:1706.00227 (Published 2017-06-01)
An Effective Approach for Point Clouds Registration Based on the Hard and Soft Assignments
Comments: 23 pages, 6 figures, 2 tables
Categories: cs.CV
For the registration of partially overlapping point clouds, this paper proposes an effective approach based on both hard and soft assignments. Given two initially posed clouds, it first establishes the forward correspondence for each point in the data shape and calculates the value of a binary variable, which indicates whether this point correspondence is located in the overlapping areas or not. Then, it establishes the bilateral correspondence and computes bidirectional distances for each point in the overlapping areas. Based on the ratio of bidirectional distances, an exponential function is used to calculate a probability value, which indicates the reliability of the point correspondence. Subsequently, the values of both the hard and soft assignments are embedded into the proposed objective function for registration of partially overlapping point clouds, and a novel variant of the ICP algorithm is proposed to obtain the optimal rigid transformation. The proposed approach can achieve good registration of point clouds, even when their overlap percentage is low. Experimental results on public data sets illustrate its superiority over previous approaches in accuracy and robustness.
-
arXiv:1705.00086 (Published 2017-04-28)
Effective scaling registration approach by imposing the emphasis on the scale factor
Comments: 22 pages, 8 figures, 2 tables
Categories: cs.CV
This paper proposes an effective approach for the scaling registration of $m$-D point sets. Unlike rigid registration, scaling registration cannot be formulated as the common least-squares function because the scale factor makes the problem ill-posed. Therefore, this paper designs a novel objective function for the scaling registration problem. This objective function takes the form of a rational fraction, where the numerator is the least-squares error and the denominator is the square of the scale factor. By placing emphasis on the scale factor, the ill-posed problem can be avoided in scaling registration. Subsequently, the new objective function can be solved by the proposed scaling iterative closest point (ICP) algorithm, which can obtain the optimal scaling transformation. For practical applications, the scaling ICP algorithm is further extended to align partially overlapping point sets. Finally, the proposed approach is tested on public data sets and applied to merging grid maps of different resolutions. Experimental results demonstrate its superiority over previous approaches in efficiency and robustness.
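Read literally, the rational objective described in this abstract might be written as below. The notation is an assumption, not taken from the paper: $\mathbf{p}_i$ are the $N$ data points, $\mathbf{q}_{c(i)}$ is the model point currently matched to $\mathbf{p}_i$, and $s$, $\mathbf{R}$, $\mathbf{t}$ are the scale factor, rotation and translation.

```latex
$$
\min_{s,\,\mathbf{R},\,\mathbf{t}} \;
\frac{\sum_{i=1}^{N} \left\| s\,\mathbf{R}\,\mathbf{p}_i + \mathbf{t} - \mathbf{q}_{c(i)} \right\|_2^2}{s^2}
$$
```

One plausible reading of the ill-posedness is that, with the numerator alone, letting $s \to 0$ collapses the data onto a single point that can match some model point with near-zero error. Dividing by $s^2$ penalizes that degenerate solution.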
-
arXiv:1702.06264 (Published 2017-02-21)
Weighted Motion Averaging for the Registration of Multi-View Range Scans
Comments: 9 pages, 6 figures, 2 tables
Categories: cs.CV
Multi-view registration is a fundamental but challenging problem in 3D reconstruction and robot vision. Although the original motion averaging algorithm has been introduced as an effective means to solve the multi-view registration problem, it does not consider the reliability and accuracy of each relative motion. Accordingly, this paper proposes a novel motion averaging algorithm for multi-view registration. First, it utilizes the pair-wise registration algorithm to estimate the relative motion and overlapping percentage of each scan pair with a certain degree of overlap. With the overlapping percentage available, it treats the overlapping percentage as the weight of each scan pair and proposes the weighted motion averaging algorithm, which pays more attention to reliable and accurate relative motions. By treating each relative motion distinctively, more accurate registration can be achieved by applying the weighted motion averaging to multi-view range scans. Experimental results demonstrate the superiority of our proposed approach compared with the state-of-the-art methods in terms of accuracy, robustness and efficiency.
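The idea of letting overlap percentages act as weights can be illustrated with a much simpler operation than the paper's algorithm: a weighted chordal mean of several rotation estimates. This is a swapped-in technique chosen for brevity, not the weighted motion averaging proposed in the paper. The function name and the toy overlap values are assumptions.

```python
import numpy as np

def weighted_rotation_mean(rotations, weights):
    """Weighted chordal mean: project the weighted sum of rotations back onto SO(3)."""
    w = np.asarray(weights, dtype=float)
    M = np.einsum("k,kij->ij", w / w.sum(), np.asarray(rotations, dtype=float))
    U, _, Vt = np.linalg.svd(M)
    # Enforce det = +1 so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(U @ Vt) >= 0 else -1.0
    return U @ np.diag([1.0, 1.0, d]) @ Vt

def rot_z(angle):
    """Rotation about the z axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Two estimates of the same rotation; the first comes from a high-overlap scan pair.
estimates = [rot_z(0.0), rot_z(0.2)]
overlaps = [0.9, 0.1]  # hypothetical overlap percentages used as weights
R_mean = weighted_rotation_mean(estimates, overlaps)
angle = np.arctan2(R_mean[1, 0], R_mean[0, 0])
```

The averaged angle lands much closer to the estimate backed by the larger overlap, which is the qualitative behavior the abstract attributes to weighting.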