Back to the Context: Tuning Visual Localization Using Structural and Edge Context Within Image

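Visual localization has become a crucial task in robotics, especially in autonomous vehicles and virtual reality, due to its ability to utilize inexpensive sensors and achieve high accuracy. Among various methods, the scene coordinate regression network is a recent approach. This method uses a neural network to regress the 2D-3D correspondences from images and utilizes these correspondences in a pose solver like PnP-RANSAC to estimate the pose of the query image. A common challenge is that regressing these correspondences often involves sampling across the entire 2D image, which is inefficient as not all areas contain useful information for the network. To address this, we propose sampling only the essential regions of an image to enhance the network’s learning efficiency. Our method selectively captures informative features by integrating the structural and edge contexts within images, identifying robust regions for sampling. This refinement allows the network to learn 2D-3D correspondences better. We tested our approach using both the publicly available outdoor dataset and our custom dataset, where it achieved state-of-the-art results in a large dataset.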

Bibliographic Details
Main Authors:	Nanda Febri Istighfarin, Seongwon Lee, HyungGi Jo
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Visual localization; pose estimation; sampling module; edge detector; structural context; attention
Online Access:	https://ieeexplore.ieee.org/document/10723272/

_version_	1846127488567083008
author	Nanda Febri Istighfarin, Seongwon Lee, HyungGi Jo
author_facet	Nanda Febri Istighfarin, Seongwon Lee, HyungGi Jo
author_sort	Nanda Febri Istighfarin
collection	DOAJ
description	Visual localization has become a crucial task in robotics, especially in autonomous vehicles and virtual reality, due to its ability to utilize inexpensive sensors and achieve high accuracy. Among various methods, the scene coordinate regression network is a recent approach. This method uses a neural network to regress the 2D-3D correspondences from images and utilizes these correspondences in a pose solver like PnP-RANSAC to estimate the pose of the query image. A common challenge is that regressing these correspondences often involves sampling across the entire 2D image, which is inefficient as not all areas contain useful information for the network. To address this, we propose sampling only the essential regions of an image to enhance the network’s learning efficiency. Our method selectively captures informative features by integrating the structural and edge contexts within images, identifying robust regions for sampling. This refinement allows the network to learn 2D-3D correspondences better. We tested our approach using both the publicly available outdoor dataset and our custom dataset, where it achieved state-of-the-art results in a large dataset.
format	Article
id	doaj-art-e809f62ed7cf4c388f0a51f00d721566
institution	Kabale University
issn	2169-3536
language	English
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-e809f62ed7cf4c388f0a51f00d721566; 2024-12-12T00:00:46Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 154963; 154974; 10.1109/ACCESS.2024.3483963; 10723272; Back to the Context: Tuning Visual Localization Using Structural and Edge Context Within Image; Nanda Febri Istighfarin [0] https://orcid.org/0009-0007-3720-4356; Seongwon Lee [1] https://orcid.org/0000-0002-7077-5595; HyungGi Jo [2] https://orcid.org/0000-0003-2689-1940; Division of Electronic Engineering, Jeonbuk National University, Jeonju, South Korea; School of Electrical Engineering, Kookmin University, Seoul, South Korea; Division of Electronic Engineering, Jeonbuk National University, Jeonju, South Korea; https://ieeexplore.ieee.org/document/10723272/; Visual localization; pose estimation; sampling module; edge detector; structural context; attention
title	Back to the Context: Tuning Visual Localization Using Structural and Edge Context Within Image
topic	Visual localization; pose estimation; sampling module; edge detector; structural context; attention
url	https://ieeexplore.ieee.org/document/10723272/

Similar Items