DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior
Recent advancements in cost-effective image-based localization using 2D maps have garnered significant attention, inspired by humans’ ability to navigate with such maps. This study addresses the limitations of monocular vision-based systems, specifically inaccurate depth information and loss of geometric details, which hinder precise localization.
| Main Authors: | Kyoung Eun Kim; Joo Yong Sim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Image-based map matching; visual localization; monocular depth estimation; feature fusion |
| Online Access: | https://ieeexplore.ieee.org/document/10772238/ |
| _version_ | 1846129710993506304 |
|---|---|
| author | Kyoung Eun Kim; Joo Yong Sim |
| author_facet | Kyoung Eun Kim; Joo Yong Sim |
| author_sort | Kyoung Eun Kim |
| collection | DOAJ |
| description | Recent advancements in cost-effective image-based localization using 2D maps have garnered significant attention, inspired by humans’ ability to navigate with such maps. This study addresses the limitations of monocular vision-based systems, specifically inaccurate depth information and loss of geometric details, which hinder precise localization. We propose a novel neural network framework that incorporates a pretrained metric depth estimation model, such as Zoedepth, to accurately measure absolute distances and enhance map matching between 2D maps and images. Our approach introduces two key modules: an Explicit Depth Prior Fusion (EDPF) module, which constructs a depth score volume using depth maps, and an Implicit Depth Prior Fusion (IDPF) module, which integrates depth and semantic features early through positional encoding. These modules enable a single-layer-scale classifier to learn essential features for effective localization. Notably, the IDPF model with positional encoding showed over 10% performance improvement on the Mapillary dataset compared to the baseline, underscoring the advantages of combining semantic and geometric information. The proposed DP-Loc approach provides a cost-efficient solution for visual localization by leveraging publicly accessible 2D maps and monocular image inputs, making it applicable to autonomous driving, robotics, and augmented reality. |
| format | Article |
| id | doaj-art-c8075dba132245c38682447a201577ce |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | Kyoung Eun Kim (https://orcid.org/0009-0007-5570-5170) and Joo Yong Sim (https://orcid.org/0000-0003-3779-7589), Department of Mechanical Systems Engineering, Sookmyung Women’s University, Seoul, Republic of Korea. “DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior,” IEEE Access, vol. 12, pp. 181570–181578, 2024. DOI: 10.1109/ACCESS.2024.3510046. Available: https://ieeexplore.ieee.org/document/10772238/ |
| spellingShingle | Kyoung Eun Kim; Joo Yong Sim; DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior; IEEE Access; Image-based map matching; visual localization; monocular depth estimation; feature fusion |
| title | DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior |
| title_full | DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior |
| title_fullStr | DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior |
| title_full_unstemmed | DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior |
| title_short | DP-Loc: Visual Localization in 2D Maps Using an Embedded Depth Prior |
| title_sort | dp loc visual localization in 2d maps using an embedded depth prior |
| topic | Image-based map matching; visual localization; monocular depth estimation; feature fusion |
| url | https://ieeexplore.ieee.org/document/10772238/ |
| work_keys_str_mv | AT kyoungeunkim dplocvisuallocalizationin2dmapsusinganembeddeddepthprior AT jooyongsim dplocvisuallocalizationin2dmapsusinganembeddeddepthprior |
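The abstract above describes an Implicit Depth Prior Fusion (IDPF) module that injects positionally encoded metric depth into the image's semantic features early, so that a single-layer-scale classifier can learn from geometry and semantics jointly. The snippet below is a minimal illustrative sketch of that early-fusion idea, assuming PyTorch; the names `sinusoidal_encoding` and `ImplicitDepthFusion`, the channel sizes, and the specific sinusoidal encoding are hypothetical choices for illustration and are not taken from the paper.

```python
# Minimal sketch of an IDPF-style early fusion block (hypothetical, PyTorch assumed).
import math
import torch
import torch.nn as nn


def sinusoidal_encoding(depth: torch.Tensor, num_freqs: int = 4) -> torch.Tensor:
    """Encode a 1-channel metric depth map (B, 1, H, W) with sin/cos frequencies."""
    freqs = [2.0 ** i * math.pi for i in range(num_freqs)]
    feats = [fn(depth * f) for f in freqs for fn in (torch.sin, torch.cos)]
    return torch.cat(feats, dim=1)  # (B, 2 * num_freqs, H, W)


class ImplicitDepthFusion(nn.Module):
    """Fuse positionally encoded depth with semantic features, then classify."""

    def __init__(self, sem_channels: int = 64, num_freqs: int = 4, num_classes: int = 10):
        super().__init__()
        fused = sem_channels + 2 * num_freqs
        self.mix = nn.Sequential(
            nn.Conv2d(fused, sem_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Single classifier head operating on the fused features.
        self.classifier = nn.Conv2d(sem_channels, num_classes, kernel_size=1)

    def forward(self, sem_feats: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        pe = sinusoidal_encoding(depth)         # geometric prior from metric depth
        x = torch.cat([sem_feats, pe], dim=1)   # early (implicit) fusion
        return self.classifier(self.mix(x))


if __name__ == "__main__":
    sem = torch.randn(2, 64, 32, 32)   # semantic features from an image encoder
    depth = torch.rand(2, 1, 32, 32)   # metric depth from a pretrained estimator (e.g., ZoeDepth)
    print(ImplicitDepthFusion()(sem, depth).shape)  # torch.Size([2, 10, 32, 32])
```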