Camera Absolute Pose Estimation Using Hierarchical Attention in Multi-Scene

The multi-scene camera pose estimation approach aims to recover the camera pose from any given scene, catering to the demands of real-life mobile devices performing everyday tasks. Facing the challenge that it is difficult to extract effective features when training multi-scene models, we present a modified model named Hierarchical Attention Absolute Pose Regression (H-AttnAPR), which captures feature dependencies at different scales. A Hierarchical Attention (HA) module is introduced prior to the scene classification module; it captures both intra- and inter-correlations among image patches, exploiting both local and global key information from images to recover the absolute camera pose without the need for additional point cloud data. H-AttnAPR efficiently models global dependencies without sacrificing fine-grained feature information, thereby overcoming the limitation of approaches that focus solely on long-range, pixel-level feature dependencies while neglecting dependencies among local image patches. Our approach is validated on the 7Scenes and Cambridge benchmark datasets: compared to the baseline algorithm PoseNet, it achieves a 41.1% reduction in translation error and a 61.1% reduction in rotation error, demonstrating superior performance in multi-scene absolute camera pose regression.
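For context, absolute pose regression (APR) methods in the PoseNet line typically represent a camera pose as a 3-vector translation plus a unit quaternion, and train with a weighted sum of the two error terms. The following is a minimal numpy sketch of that standard loss form; the weighting factor `beta` and the example values are illustrative assumptions, not values taken from this article.

```python
import numpy as np

def posenet_loss(t_pred, q_pred, t_true, q_true, beta=250.0):
    """PoseNet-style absolute pose regression loss (schematic).

    t_*: 3-vector camera translations; q_*: quaternions encoding rotation.
    The predicted quaternion is normalized before comparison, and the
    rotation term is weighted by beta (an assumed value here) to balance
    the scales of the two error terms.
    """
    t_pred, q_pred = np.asarray(t_pred, float), np.asarray(q_pred, float)
    t_true, q_true = np.asarray(t_true, float), np.asarray(q_true, float)
    q_pred = q_pred / np.linalg.norm(q_pred)   # project onto unit quaternions
    t_err = np.linalg.norm(t_true - t_pred)    # translation error
    q_err = np.linalg.norm(q_true - q_pred)    # rotation (quaternion) error
    return t_err + beta * q_err

# A perfect prediction gives zero loss.
identity_q = np.array([1.0, 0.0, 0.0, 0.0])
print(posenet_loss([0, 0, 0], identity_q, [0, 0, 0], identity_q))  # 0.0
```

The reported 41.1%/61.1% improvements over PoseNet are measured on exactly these two error components (translation and rotation).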

Bibliographic Details
Main Authors: Xinhua Lu, Jingui Miao, Qingji Xue, Hui Wan, Hao Zhang
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects: Image processing; pose estimation; attention mechanisms; feature extraction; deep learning
Online Access:https://ieeexplore.ieee.org/document/10847815/
author Xinhua Lu
Jingui Miao
Qingji Xue
Hui Wan
Hao Zhang
collection DOAJ
description The multi-scene camera pose estimation approach aims to recover the camera pose from any given scene, catering to the demands of real-life mobile devices performing everyday tasks. Facing the challenge that it is difficult to extract effective features when training multi-scene models, we present a modified model named Hierarchical Attention Absolute Pose Regression (H-AttnAPR), which captures feature dependencies at different scales. A Hierarchical Attention (HA) module is introduced prior to the scene classification module; it captures both intra- and inter-correlations among image patches, exploiting both local and global key information from images to recover the absolute camera pose without the need for additional point cloud data. H-AttnAPR efficiently models global dependencies without sacrificing fine-grained feature information, thereby overcoming the limitation of approaches that focus solely on long-range, pixel-level feature dependencies while neglecting dependencies among local image patches. Our approach is validated on the 7Scenes and Cambridge benchmark datasets: compared to the baseline algorithm PoseNet, it achieves a 41.1% reduction in translation error and a 61.1% reduction in rotation error, demonstrating superior performance in multi-scene absolute camera pose regression.
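The core idea the description attributes to the HA module, attention within local groups of image patches (intra-correlations) combined with attention across groups (inter-correlations), can be illustrated schematically. This numpy sketch shows the general local/global attention pattern only; the window size, mean-pooling of window summaries, additive fusion, and the omission of learned projection weights are simplifying assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Plain scaled dot-product self-attention (learned Q/K/V weights omitted)."""
    d = x.shape[-1]
    attn = softmax(x @ x.T / np.sqrt(d))
    return attn @ x

def hierarchical_attention(tokens, window=4):
    """Two-level attention over patch tokens.

    Level 1 (intra): self-attention within each local window of patches,
    capturing fine-grained local dependencies.
    Level 2 (inter): self-attention across mean-pooled window summaries,
    capturing global dependencies, broadcast back to every patch.
    """
    n, d = tokens.shape
    assert n % window == 0, "token count must be divisible by the window size"
    local = np.concatenate(
        [self_attention(tokens[i:i + window]) for i in range(0, n, window)])
    summaries = local.reshape(n // window, window, d).mean(axis=1)
    global_ctx = self_attention(summaries)                 # inter-window mixing
    return local + np.repeat(global_ctx, window, axis=0)   # fuse local + global

tokens = np.random.default_rng(0).standard_normal((16, 8))  # 16 patch tokens
out = hierarchical_attention(tokens)
print(out.shape)  # (16, 8)
```

The point of the two levels is the trade-off the abstract describes: global mixing happens over cheap window summaries, so long-range dependencies are modeled without discarding the fine-grained per-patch features computed at the local level.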
format Article
id doaj-art-b340b16eb1044fa888bf0f1f6c023cb7
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling
  Record ID: doaj-art-b340b16eb1044fa888bf0f1f6c023cb7
  Indexed: 2025-02-04T00:00:42Z
  Language: English
  Publisher: IEEE
  Journal: IEEE Access (ISSN 2169-3536)
  Published: 2025-01-01, Vol. 13, pp. 19624-19634
  DOI: 10.1109/ACCESS.2025.3531896
  IEEE document: 10847815
  Title: Camera Absolute Pose Estimation Using Hierarchical Attention in Multi-Scene
  Authors:
    Xinhua Lu (https://orcid.org/0000-0002-2338-7020), Electronic Information Institute, Nanyang Institute of Technology, Nanyang, China
    Jingui Miao (https://orcid.org/0009-0004-0506-9063), School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
    Qingji Xue, Electronic Information Institute, Nanyang Institute of Technology, Nanyang, China
    Hui Wan (https://orcid.org/0009-0001-3823-6399), School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
    Hao Zhang (https://orcid.org/0009-0002-2784-1355), School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
  URL: https://ieeexplore.ieee.org/document/10847815/
  Subjects: Image processing; pose estimation; attention mechanisms; feature extraction; deep learning
title Camera Absolute Pose Estimation Using Hierarchical Attention in Multi-Scene
topic Image processing
pose estimation
attention mechanisms
feature extraction
deep learning
url https://ieeexplore.ieee.org/document/10847815/