SEMA-YOLO: Lightweight Small Object Detection in Remote Sensing Image via Shallow-Layer Enhancement and Multi-Scale Adaptation
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Remote Sensing |
| Online Access: | https://www.mdpi.com/2072-4292/17/11/1917 |
| Summary: | Small object detection remains a challenge in the remote sensing field due to feature loss during downsampling and interference from complex backgrounds. A novel network, termed SEMA-YOLO, is proposed in this paper as an enhanced YOLOv11-based framework incorporating three technical advancements. By fundamentally reducing information loss and incorporating a cross-scale feature fusion mechanism, the proposed framework significantly enhances small object detection performance. First, the Shallow Layer Enhancement (SLE) strategy reduces backbone depth and introduces small-object detection heads, thereby enlarging the feature maps available to small objects. Then, the Global Context Pooling-enhanced Adaptively Spatial Feature Fusion (GCP-ASFF) architecture is designed to optimize cross-scale feature interaction across the four detection heads. Finally, the RFA-C3k2 module, which integrates Receptive Field Adaptation (RFA) with the C3k2 structure, is introduced to achieve more refined feature extraction. SEMA-YOLO demonstrates clear advantages in complex urban environments and densely packed target areas, and it generalizes well to diverse scenarios. Experimental results show that SEMA-YOLO achieves mAP₅₀ scores of 72.5% on the RS-STOD dataset and 61.5% on the AI-TOD dataset, surpassing state-of-the-art models. |
| ISSN: | 2072-4292 |
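The benefit of the SLE strategy can be made concrete with a little stride arithmetic: adding a shallow, stride-4 (P2) detection head quadruples the grid area available to tiny objects relative to the usual stride-8 (P3) head. The snippet below assumes a 640×640 input and a standard P2–P5 stride layout; the abstract does not give the exact YOLOv11 head configuration.

```python
# Grid resolution per detection head for an assumed 640x640 input.
# A 10 px object spans ~2.5 cells at stride 4 but under one third of a
# cell at stride 32, which is why a shallow head helps tiny objects.
input_size = 640
for name, stride in [("P2 (added by SLE)", 4), ("P3", 8), ("P4", 16), ("P5", 32)]:
    side = input_size // stride
    print(f"{name}: stride {stride} -> {side}x{side} grid ({side * side} cells)")
```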
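The abstract names GCP-ASFF without detailing its layers, so the following PyTorch sketch only illustrates the underlying idea: adaptively spatial feature fusion resizes every scale to a common resolution, weights each scale per pixel with a softmax, and (here) a global-context pooling branch rescales the fused map. The shared channel width, the squeeze-and-excitation-style pooling branch, and all layer choices are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASFFSketch(nn.Module):
    """Adaptively spatial feature fusion for one output level (sketch).

    Assumes all four head features share `channels`; real implementations
    typically add per-level compression convs first.
    """

    def __init__(self, channels: int, num_levels: int = 4):
        super().__init__()
        # One 1x1 conv per level predicts a scalar weight map.
        self.weight_convs = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_levels)]
        )
        # Hypothetical global-context pooling branch: a pooled descriptor
        # rescales the fused map channel-wise.
        self.gcp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # Bring every level to the resolution of the first (target) level.
        target_hw = feats[0].shape[-2:]
        resized = [
            f if f.shape[-2:] == target_hw
            else F.interpolate(f, size=target_hw, mode="nearest")
            for f in feats
        ]
        # Per-pixel softmax over levels -> adaptive spatial weights.
        logits = torch.cat(
            [conv(f) for conv, f in zip(self.weight_convs, resized)], dim=1
        )
        weights = torch.softmax(logits, dim=1)  # (B, num_levels, H, W)
        fused = sum(weights[:, i : i + 1] * resized[i] for i in range(len(resized)))
        return fused * self.gcp(fused)

# Four head features at strides 4/8/16/32 for a 640x640 image:
feats = [torch.randn(1, 128, s, s) for s in (160, 80, 40, 20)]
out = ASFFSketch(channels=128)(feats)  # fused at the stride-4 resolution
```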
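RFA-C3k2 is likewise only named in the abstract. The sketch below follows the generic receptive-field attention idea that RFA usually denotes: predict a weight for each of the k×k positions inside every sliding window, re-weight the window, then aggregate. How such a layer is wired into the C3k2 block is not specified, so everything here is a hypothetical stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RFASketch(nn.Module):
    """Receptive-field adaptation (sketch): per-pixel attention over the
    k*k positions of each sliding window before aggregation."""

    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.k = k
        # A local summary (avg pool) predicts one logit per window position.
        self.weight_gen = nn.Sequential(
            nn.AvgPool2d(kernel_size=k, stride=1, padding=k // 2),
            nn.Conv2d(channels, k * k, kernel_size=1),
        )
        # 1x1 conv mixes the re-weighted window average across channels.
        self.project = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        k2 = self.k * self.k
        # Every kxk window, flattened per output pixel: (B, C, k*k, H, W).
        patches = F.unfold(x, kernel_size=self.k, padding=self.k // 2)
        patches = patches.view(b, c, k2, h, w)
        # Softmax over the k*k window positions -> adaptive receptive field.
        weights = torch.softmax(self.weight_gen(x).view(b, 1, k2, h, w), dim=2)
        return self.project((patches * weights).sum(dim=2))

x = torch.randn(1, 64, 40, 40)
y = RFASketch(channels=64)(x)  # same shape as x: (1, 64, 40, 40)
```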