DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing

The performance of visual SLAM in complex, real-world scenarios is often compromised by unreliable feature extraction and matching when using handcrafted features. Although deep learning-based local features excel at capturing high-level information and perform well on matching benchmarks, they stru...

Bibliographic Details
Main Authors: Hao Qu, Lilian Zhang, Jun Mao, Junbo Tie, Xiaofeng He, Xiaoping Hu, Yifei Shi, Changhao Chen
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Applied Sciences
Subjects: monocular SLAM; deep learning; feature extraction and matching; loop closing
Online Access: https://www.mdpi.com/2076-3417/15/14/7838
author Hao Qu
Lilian Zhang
Jun Mao
Junbo Tie
Xiaofeng He
Xiaoping Hu
Yifei Shi
Changhao Chen
collection DOAJ
description The performance of visual SLAM in complex, real-world scenarios is often compromised by unreliable feature extraction and matching when using handcrafted features. Although deep learning-based local features excel at capturing high-level information and perform well on matching benchmarks, they struggle with generalization in continuous motion scenes, adversely affecting loop detection accuracy. Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks, enhancing their adaptability to diverse environments. Additionally, we introduce a coarse-to-fine feature tracking mechanism for learned keypoints. It begins with a direct method to approximate the relative pose between consecutive frames, followed by a feature matching method for refined pose estimation. To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection. This module dynamically identifies loop nodes within a sequence, ensuring accurate and efficient localization. Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning-based SLAM systems, such as ORB-SLAM3 and LIFT-SLAM. DK-SLAM achieves 17.7% better translation accuracy and 24.2% better rotation accuracy than ORB-SLAM3 on KITTI and 34.2% better translation accuracy on EuRoC. These results underscore the efficacy and robustness of our DK-SLAM in varied and challenging real-world environments.
format Article
id doaj-art-3eed326e99f0439e8a718193786ff7cd
institution Kabale University
issn 2076-3417
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-3eed326e99f0439e8a718193786ff7cd
2025-08-20T03:36:15Z
eng
MDPI AG
Applied Sciences
2076-3417
2025-07-01
15
14
7838
10.3390/app15147838
DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing
Hao Qu, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Lilian Zhang, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Jun Mao, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Junbo Tie, College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
Xiaofeng He, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Xiaoping Hu, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Yifei Shi, College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
Changhao Chen, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
The performance of visual SLAM in complex, real-world scenarios is often compromised by unreliable feature extraction and matching when using handcrafted features. Although deep learning-based local features excel at capturing high-level information and perform well on matching benchmarks, they struggle with generalization in continuous motion scenes, adversely affecting loop detection accuracy. Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks, enhancing their adaptability to diverse environments. Additionally, we introduce a coarse-to-fine feature tracking mechanism for learned keypoints. It begins with a direct method to approximate the relative pose between consecutive frames, followed by a feature matching method for refined pose estimation. To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection. This module dynamically identifies loop nodes within a sequence, ensuring accurate and efficient localization. Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning-based SLAM systems, such as ORB-SLAM3 and LIFT-SLAM. DK-SLAM achieves 17.7% better translation accuracy and 24.2% better rotation accuracy than ORB-SLAM3 on KITTI and 34.2% better translation accuracy on EuRoC. These results underscore the efficacy and robustness of our DK-SLAM in varied and challenging real-world environments.
https://www.mdpi.com/2076-3417/15/14/7838
monocular SLAM
deep learning
feature extraction and matching
loop closing
title DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing
topic monocular SLAM
deep learning
feature extraction and matching
loop closing
url https://www.mdpi.com/2076-3417/15/14/7838
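
The description above says that loop closure detection relies on binary features. As a rough illustration of why binary descriptors suit this task, the sketch below compares frames by Hamming distance and picks the closest earlier keyframe. It is a generic nearest-neighbour scheme, not the online learning module of DK-SLAM, and every name and threshold in it (`hamming`, `frame_distance`, `find_loop_candidate`, `max_distance`, `min_gap`) is a hypothetical choice made for this example.

```python
from typing import List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Count differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def frame_distance(query: List[int], candidate: List[int]) -> float:
    """Mean Hamming distance from each query descriptor to its nearest
    descriptor in the candidate frame (lower means more similar)."""
    if not query or not candidate:
        return float("inf")
    total = sum(min(hamming(q, c) for c in candidate) for q in query)
    return total / len(query)


def find_loop_candidate(
    query: List[int],
    keyframes: List[List[int]],
    max_distance: float = 2.0,  # assumed threshold, illustrative only
    min_gap: int = 1,           # skip the most recent keyframes (assumed)
) -> Optional[Tuple[int, float]]:
    """Return (keyframe index, distance) of the best earlier keyframe whose
    distance is within max_distance, or None if no keyframe qualifies."""
    best: Optional[Tuple[int, float]] = None
    searchable = keyframes[: max(0, len(keyframes) - min_gap)]
    for idx, descriptors in enumerate(searchable):
        d = frame_distance(query, descriptors)
        if d <= max_distance and (best is None or d < best[1]):
            best = (idx, d)
    return best


# Toy 8-bit descriptors; real binary descriptors are much longer.
keyframes = [
    [0b10110010, 0b01101100],
    [0b11110000, 0b00001111],
    [0b10101010, 0b01010101],
]
query = [0b10110011, 0b01101100]  # differs from keyframe 0 by a single bit
print(find_loop_candidate(query, keyframes))
```

Because the comparison reduces to XOR and a bit count, each check is cheap, which is one reason binary features are attractive for loop detection. A practical system would add an index structure and geometric verification, both omitted here.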