OBTPN: A Vision-Based Network for UAV Geo-Localization in Multi-Altitude Environments

Bibliographic Details
Main Authors: Nanxing Chen, Jiqi Fan, Jiayu Yuan, Enhui Zheng
Format: Article
Language: English
Published: MDPI AG, 2025-01-01
Series: Drones
Online Access: https://www.mdpi.com/2504-446X/9/1/33
Description
Summary: UAVs typically rely on satellite navigation for positioning, yet this approach fails when the signal is weak or communication is disrupted. Vision-based positioning technology has emerged as a reliable alternative. In this paper, we propose a novel end-to-end network, OBTPN. In the model's backbone, we optimized the distribution of attention to achieve a balance between self-attention and cross-attention. We then devised a feature fusion head, which strengthens the model's capacity to process multi-scale information. OBTPN was successfully deployed on an NVIDIA Jetson TX2 onboard computer. This paper also proposes Crossview9, a high-altitude complex-environment dataset that addresses a research gap in high-altitude visual navigation, and evaluates the model's performance on it. Additionally, the dataset was degraded to simulate low-quality imagery and assess the model's resilience under challenging weather conditions. The experimental results show that OBTPN_256 attains 84.55% on the RDS metric, reaching the state-of-the-art (SOTA) level on the UL14 dataset. On the Crossview9 dataset, OBTPN_256 achieves an RDS score of 79.76%, also at the SOTA level. Most notably, the model's high accuracy on low-quality images further substantiates its robustness in complex environments.
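The abstract names two technical ideas without detail: balancing self-attention against cross-attention in the backbone, and a feature fusion head for multi-scale information. The record contains no code, so the sketch below illustrates only the attention-balancing idea in PyTorch; the class name BalancedAttentionBlock, the token dimensions, and the learnable mixing weight alpha are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class BalancedAttentionBlock(nn.Module):
    """One block mixing self-attention on UAV-view tokens with
    cross-attention to satellite-view tokens (hypothetical design)."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Learnable scalar balancing self- vs. cross-attention (assumption).
        self.alpha = nn.Parameter(torch.tensor(0.5))
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, uav_tokens, sat_tokens):
        x = self.norm1(uav_tokens)
        s, _ = self.self_attn(x, x, x)                     # intra-view context
        c, _ = self.cross_attn(x, sat_tokens, sat_tokens)  # UAV-to-satellite matching
        x = uav_tokens + self.alpha * s + (1.0 - self.alpha) * c
        return x + self.ffn(self.norm2(x))

# Usage: 196 UAV patch tokens attend to 400 satellite patch tokens.
uav = torch.randn(2, 196, 256)
sat = torch.randn(2, 400, 256)
out = BalancedAttentionBlock()(uav, sat)  # -> shape (2, 196, 256)

The RDS figures quoted above measure localization quality as a relative distance score. The exact definition used for UL14 is not given in this record; a common form, assumed here, maps the pixel error between predicted and true positions, normalized by map size, through a decaying exponential:

import math

def rds(dx, dy, w, h, k=10.0):
    # Hypothetical relative distance score: 1.0 for a perfect hit,
    # decaying toward 0 as the error (dx, dy), normalized by the
    # map size (w, h), grows; the decay rate k is an assumption.
    return math.exp(-k * math.sqrt(((dx / w) ** 2 + (dy / h) ** 2) / 2))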
ISSN: 2504-446X