Keypoint Description by Descriptor Fusion Using Autoencoders

Zhuang Dai1, Xinghong Huang1, Weinan Chen1, Chuangbin Chen1, Li He2, Shuhuan Wen3, Hong Zhang2

  • 1Guangdong University of Technology
  • 2University of Alberta
  • 3Yanshan University

Details

10:00 - 10:15 | Mon 1 Jun | Room T2 | MoA02.4

Session: SLAM I

Abstract

Keypoint matching is an important operation in computer vision and its applications, such as visual simultaneous localization and mapping (SLAM) in robotics. This matching operation depends heavily on the descriptors of the keypoints, and it must be performed reliably when images undergo conditional changes, such as changes in illumination and viewpoint. In this paper, a descriptor fusion model (DFM) is proposed to create a robust keypoint descriptor by fusing CNN-based descriptors using autoencoders. Our DFM architecture can be adapted to either trained or pre-trained CNN models. Based on the performance of existing CNN descriptors, we choose HardNet and DenseNet169 as representatives of trained and pre-trained descriptors, respectively. Our proposed DFM is evaluated on the latest benchmark datasets in computer vision with challenging conditional changes. The experimental results show that DFM achieves state-of-the-art performance, with a mean mAP 6.34% and 6.42% higher than that of HardNet and DenseNet169, respectively.
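The fusion idea in the abstract — compressing two concatenated CNN descriptors through an autoencoder bottleneck and using the code as the fused descriptor — can be sketched as below. This is a minimal illustrative sketch, not the authors' DFM: the linear encoder/decoder, the 256-D DenseNet169 descriptor size, and the bottleneck width are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions (illustrative, not from the paper):
# HardNet emits 128-D descriptors; the DenseNet169-derived descriptor
# is taken here as 256-D, and the fused code is 128-D.
D1, D2, BOTTLENECK = 128, 256, 128

def fuse_descriptors(d1, d2, W_enc, W_dec):
    """Fuse two keypoint descriptors with a linear autoencoder (sketch).

    The concatenated input is compressed to a bottleneck code, which
    serves as the fused descriptor; the decoder reconstructs the input
    so the code can be trained with a reconstruction loss.
    """
    x = np.concatenate([d1, d2])           # (D1 + D2,)
    code = np.tanh(W_enc @ x)              # fused descriptor, (BOTTLENECK,)
    recon = W_dec @ code                   # reconstruction used for training
    return code / np.linalg.norm(code), recon

# Randomly initialised weights stand in for a trained autoencoder.
W_enc = rng.normal(scale=0.05, size=(BOTTLENECK, D1 + D2))
W_dec = rng.normal(scale=0.05, size=(D1 + D2, BOTTLENECK))

d_hardnet = rng.normal(size=D1)            # placeholder HardNet descriptor
d_densenet = rng.normal(size=D2)           # placeholder DenseNet169 descriptor
fused, recon = fuse_descriptors(d_hardnet, d_densenet, W_enc, W_dec)
print(fused.shape, recon.shape)
```

The unit-normalised bottleneck code can then be matched with the same nearest-neighbour (e.g. L2-distance) schemes used for the individual descriptors.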