DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing
The performance of visual SLAM in complex, real-world scenarios is often compromised by unreliable feature extraction and matching when using handcrafted features. Although deep learning-based local features excel at capturing high-level information and perform well on matching benchmarks, they stru...
| Main Authors: | Hao Qu, Lilian Zhang, Jun Mao, Junbo Tie, Xiaofeng He, Xiaoping Hu, Yifei Shi, Changhao Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Subjects: | monocular SLAM; deep learning; feature extraction and matching; loop closing |
| Online Access: | https://www.mdpi.com/2076-3417/15/14/7838 |
| _version_ | 1849406818857517056 |
|---|---|
| author | Hao Qu Lilian Zhang Jun Mao Junbo Tie Xiaofeng He Xiaoping Hu Yifei Shi Changhao Chen |
| author_facet | Hao Qu Lilian Zhang Jun Mao Junbo Tie Xiaofeng He Xiaoping Hu Yifei Shi Changhao Chen |
| author_sort | Hao Qu |
| collection | DOAJ |
| description | The performance of visual SLAM in complex, real-world scenarios is often compromised by unreliable feature extraction and matching when using handcrafted features. Although deep learning-based local features excel at capturing high-level information and perform well on matching benchmarks, they struggle with generalization in continuous motion scenes, adversely affecting loop detection accuracy. Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks, enhancing their adaptability to diverse environments. Additionally, we introduce a coarse-to-fine feature tracking mechanism for learned keypoints. It begins with a direct method to approximate the relative pose between consecutive frames, followed by a feature matching method for refined pose estimation. To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection. This module dynamically identifies loop nodes within a sequence, ensuring accurate and efficient localization. Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning-based SLAM systems, such as ORB-SLAM3 and LIFT-SLAM. DK-SLAM achieves 17.7% better translation accuracy and 24.2% better rotation accuracy than ORB-SLAM3 on KITTI and 34.2% better translation accuracy on EuRoC. These results underscore the efficacy and robustness of our DK-SLAM in varied and challenging real-world environments. |
| format | Article |
| id | doaj-art-3eed326e99f0439e8a718193786ff7cd |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-3eed326e99f0439e8a718193786ff7cd; 2025-08-20T03:36:15Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-07-01; 15; 14; 7838; 10.3390/app15147838; DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing; Hao Qu (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China); Lilian Zhang (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China); Jun Mao (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China); Junbo Tie (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China); Xiaofeng He (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China); Xiaoping Hu (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China); Yifei Shi (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China); Changhao Chen (College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China); The performance of visual SLAM in complex, real-world scenarios is often compromised by unreliable feature extraction and matching when using handcrafted features. Although deep learning-based local features excel at capturing high-level information and perform well on matching benchmarks, they struggle with generalization in continuous motion scenes, adversely affecting loop detection accuracy. Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks, enhancing their adaptability to diverse environments. Additionally, we introduce a coarse-to-fine feature tracking mechanism for learned keypoints. It begins with a direct method to approximate the relative pose between consecutive frames, followed by a feature matching method for refined pose estimation. To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection. This module dynamically identifies loop nodes within a sequence, ensuring accurate and efficient localization. Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning-based SLAM systems, such as ORB-SLAM3 and LIFT-SLAM. DK-SLAM achieves 17.7% better translation accuracy and 24.2% better rotation accuracy than ORB-SLAM3 on KITTI and 34.2% better translation accuracy on EuRoC. These results underscore the efficacy and robustness of our DK-SLAM in varied and challenging real-world environments.; https://www.mdpi.com/2076-3417/15/14/7838; monocular SLAM; deep learning; feature extraction and matching; loop closing |
| spellingShingle | Hao Qu Lilian Zhang Jun Mao Junbo Tie Xiaofeng He Xiaoping Hu Yifei Shi Changhao Chen DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing Applied Sciences monocular SLAM deep learning feature extraction and matching loop closing |
| title | DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing |
| title_full | DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing |
| title_fullStr | DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing |
| title_full_unstemmed | DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing |
| title_short | DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking, and Loop Closing |
| title_sort | dk slam monocular visual slam with deep keypoint learning tracking and loop closing |
| topic | monocular SLAM deep learning feature extraction and matching loop closing |
| url | https://www.mdpi.com/2076-3417/15/14/7838 |
| work_keys_str_mv | AT haoqu dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing AT lilianzhang dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing AT junmao dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing AT junbotie dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing AT xiaofenghe dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing AT xiaopinghu dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing AT yifeishi dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing AT changhaochen dkslammonocularvisualslamwithdeepkeypointlearningtrackingandloopclosing |