MCRS-YOLO: Multi-Aggregation Cross-Scale Feature Fusion Object Detector for Remote Sensing Images
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Remote Sensing |
| Online Access: | https://www.mdpi.com/2072-4292/17/13/2204 |
| Summary: | With the rapid development of deep learning, object detection in remote sensing images has attracted extensive attention. However, remote sensing images typically exhibit significant variations in object scale, dense small targets, and complex backgrounds. To address these challenges, a novel object detection method, MCRS-YOLO, is proposed. Firstly, a Multi-Branch Aggregation (MBA) network is designed to enhance information flow and mitigate the problems caused by insufficient object feature representation. Secondly, a Multi-scale Feature Refinement and Fusion Pyramid Network (MFRFPN) is constructed to effectively integrate spatially multi-scale features, thereby enriching the semantic information of the feature maps. Thirdly, a Large Depth-wise Separable Kernel (LDSK) module is proposed to comprehensively capture contextual information while enlarging the effective receptive field. Finally, the Normalized Wasserstein Distance (NWD) is introduced into hybrid loss training to emphasize small object features and suppress background interference. The efficacy and superiority of MCRS-YOLO are validated through extensive experiments on two publicly available datasets, NWPU VHR-10 and VEDAI. Compared with the baseline YOLOv11, the proposed method improves mean Average Precision (mAP) by 4.0% and 6.7% on the two datasets, respectively, providing an efficient and accurate solution for object detection in remote sensing images. |
|---|---|
| ISSN: | 2072-4292 |
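The NWD term mentioned in the summary follows the Normalized Gaussian Wasserstein Distance idea for tiny objects: each bounding box `(cx, cy, w, h)` is modeled as a 2D Gaussian, the closed-form 2-Wasserstein distance between the two Gaussians is computed, and an exponential maps it into a bounded similarity. The sketch below is an illustrative reimplementation, not the paper's code; the normalizing constant `c=12.8` is a commonly used default, not a value taken from this article.

```python
import math

def nwd(box1, box2, c=12.8):
    """Normalized Wasserstein Distance between two boxes (cx, cy, w, h).

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and a
    diagonal covariance whose square root is diag(w/2, h/2). For such
    Gaussians the squared 2-Wasserstein distance has the closed form
    below, and NWD = exp(-W2 / c) maps it into (0, 1].
    NOTE: c is dataset-dependent in practice; 12.8 is an assumed default.
    """
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)

# Identical boxes give the maximum similarity of 1.0.
print(nwd((10, 10, 4, 4), (10, 10, 4, 4)))  # → 1.0
# A shifted small box still yields a smooth, non-zero similarity,
# whereas IoU drops to 0 as soon as small boxes stop overlapping.
print(nwd((10, 10, 4, 4), (16, 10, 4, 4)))
```

This smoothness is what makes NWD attractive in a hybrid loss for dense small targets: the gradient signal does not vanish for near-miss predictions the way an IoU-based term does.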
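The LDSK module's design rests on a standard fact that the summary only implies: factoring a k×k convolution into a depth-wise k×k filter plus a 1×1 point-wise mixing step makes large kernels (and hence large effective receptive fields) cheap. The parameter counts below are the textbook formulas for the two factorizations, not figures from this article; the channel width 256 is an arbitrary example.

```python
def conv_params(k, c_in, c_out):
    # Standard k x k convolution: every output channel mixes all inputs.
    return k * k * c_in * c_out

def ds_conv_params(k, c_in, c_out):
    # Depth-wise k x k (one filter per input channel)
    # followed by a 1x1 point-wise convolution for channel mixing.
    return k * k * c_in + c_in * c_out

c_in = c_out = 256  # example channel width, not taken from the paper
for k in (3, 7, 13):
    std = conv_params(k, c_in, c_out)
    dsc = ds_conv_params(k, c_in, c_out)
    print(f"k={k:2d}: standard={std:,}  depthwise-separable={dsc:,}  "
          f"ratio={std / dsc:.1f}x")
```

Because the depth-wise term grows with k² per channel rather than per channel pair, the savings ratio itself grows with kernel size, which is why large-kernel context modules are typically built on this factorization.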