An Efficient Pyramid Transformer Network for Cross-View Geo-Localization in Complex Terrains

Bibliographic Details
Main Authors: Chengjie Ju, Wangping Xu, Nanxing Chen, Enhui Zheng
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Drones
Subjects: unmanned aerial vehicle; geo-localization; transformer
Online Access: https://www.mdpi.com/2504-446X/9/5/379
collection DOAJ
description Unmanned aerial vehicle (UAV) self-localization in complex environments is critical when global navigation satellite systems (GNSSs) are unreliable. Existing datasets, often limited to low-altitude urban scenes, hinder generalization. This study introduces Multi-UAV, a novel dataset with 17.4 k high-resolution UAV–satellite image pairs from diverse terrains (urban, rural, mountainous, farmland, coastal) and altitudes across China, enhancing cross-view geolocalization research. We propose a lightweight value reduction pyramid transformer (VRPT) for efficient feature extraction and a residual feature pyramid network (RFPN) for multi-scale feature fusion. Using meter-level accuracy (MA@K) and relative distance score (RDS), VRPT achieves robust, high-precision localization across varied terrains, offering significant potential for resource-constrained UAV deployment.
id doaj-art-a6ffaff4677247c9a725744ee36b6bfd
institution OA Journals
issn 2504-446X
doi 10.3390/drones9050379
citation Drones, vol. 9, no. 5, article 379 (2025-05-01)
affiliation Department of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China (all four authors)
topic unmanned aerial vehicle
geo-localization
transformer