Multi-scale feature fusion and feature calibration with edge information enhancement for remote sensing object detection

Bibliographic Details
Main Authors: Lihua Yang, Yi Gu, Hao Feng
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-99835-7
Description
Summary: Vision Transformer-based detectors have achieved remarkable success in object detection, but applying them to high-resolution remote sensing imagery runs into high computational cost and performance bottlenecks, especially when capturing fine-grained edge features. There is therefore significant room for performance optimization. To address these challenges, we propose EMF-DETR, an improved detector based on RT-DETR-ResNet-18. EMF-DETR introduces a multi-scale edge-aware feature extraction network named MEFE-Net, which improves object recognition and localization by extracting multi-scale features and enhancing edge information for targets at each scale, and which performs especially well in small object detection. To further enhance feature representation, the model introduces the CSFCN method, which adaptively adjusts contextual information and calibrates spatial features so that features at different scales are accurately aligned. In evaluations on the VisDrone2019 dataset, the proposed method improved mAP by 2.0% over the baseline model, with gains of 1.5% and 2.6% for small (APS) and medium (APM) objects, respectively. The number of parameters was also reduced by 20.22%, demonstrating improved detection accuracy together with lower computational cost and highlighting the method's practical potential for remote sensing image analysis.
ISSN: 2045-2322