SED-YOLO based multi-scale attention for small object detection in remote sensing

Abstract: Object detection is crucial for remote sensing image processing, yet the detection of small objects remains highly challenging due to factors such as image noise and cluttered backgrounds. In response to this challenge, this paper proposes an improved network, named SED-YOLO, based on YOLOv...

Bibliographic Details
Main Authors: Xiaotan Wei, Zhensong Li, Yutong Wang
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Subjects: Remote sensing, Object detection, YOLO, Attention mechanism
Online Access: https://doi.org/10.1038/s41598-025-87199-x
author Xiaotan Wei
Zhensong Li
Yutong Wang
collection DOAJ
description Abstract: Object detection is crucial for remote sensing image processing, yet detecting small objects remains highly challenging because of image noise and cluttered backgrounds. To address this challenge, this paper proposes SED-YOLO, an improved network based on YOLOv5s. First, Switchable Atrous Convolution (SAC) replaces the standard convolutions in the original C3 modules of the backbone network, enhancing feature extraction capability and adaptability. Second, the Efficient Multi-Scale Attention (EMA) mechanism is introduced at the end of the backbone network to enable efficient multi-scale feature learning, which reduces computational cost while preserving crucial information. In the neck, an adaptive Concat method dynamically adjusts the feature fusion strategy according to image content and object characteristics, strengthening the model’s ability to handle diverse objects. Lastly, the three-scale detection head is expanded to four scales by adding a small object detection layer, and the Dynamic Head (DyHead) module is incorporated, which enhances the detection head’s expressive power by dynamically adjusting attention weights in feature maps. Experimental results show that the improved network achieves a mean Average Precision (mAP) of 71.6% on the DOTA dataset, surpassing the original YOLOv5s by 2.4%, effectively improving the accuracy of small object detection.
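The switchable atrous convolution named in the description can be illustrated with a minimal 1D sketch in plain Python. This is not the authors' implementation: the function names, the 1D setting, the shared kernel for both branches, and the hand-supplied switch values are all assumptions made for illustration. In published SAC designs the switch is typically predicted from the input and the large-rate branch carries its own weight offset; both are omitted here.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with the given dilation.

    Assumes an odd-length kernel so the output stays centred on the input.
    """
    half = (len(kernel) // 2) * dilation
    padded = [0.0] * half + [float(v) for v in signal] + [0.0] * half
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart.
            acc += w * padded[i + k * dilation]
        out.append(acc)
    return out


def switchable_atrous_conv1d(signal, kernel, switch, small_rate=1, large_rate=3):
    """Blend a small-rate and a large-rate branch position by position.

    `switch` holds one value in [0, 1] per position: 1 selects the
    small-rate branch, 0 selects the large-rate branch (hypothetical
    simplification: both branches share the same kernel).
    """
    near = dilated_conv1d(signal, kernel, small_rate)
    far = dilated_conv1d(signal, kernel, large_rate)
    return [s * a + (1.0 - s) * b for s, a, b in zip(switch, near, far)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    box = [1.0, 1.0, 1.0]
    print(switchable_atrous_conv1d(impulse, box, [1.0] * 7))  # small rate only
    print(switchable_atrous_conv1d(impulse, box, [0.0] * 7))  # large rate only
```

With a unit impulse as input, the small-rate branch spreads the response over adjacent positions, while the large-rate branch reaches positions three samples away using the same three weights. That is the sense in which a larger atrous rate widens the receptive field without adding parameters.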
format Article
id doaj-art-18b29d2cc2574f8ca1698aabe00e1831
institution Kabale University
issn 2045-2322
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-18b29d2cc2574f8ca1698aabe00e1831 | 2025-01-26T12:34:17Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-01-01 | 151111 | 10.1038/s41598-025-87199-x | SED-YOLO based multi-scale attention for small object detection in remote sensing | Xiaotan Wei, Zhensong Li, Yutong Wang (all: Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University) | https://doi.org/10.1038/s41598-025-87199-x | Remote sensing; Object detection; YOLO; Attention mechanism
title SED-YOLO based multi-scale attention for small object detection in remote sensing
topic Remote sensing
Object detection
YOLO
Attention mechanism
url https://doi.org/10.1038/s41598-025-87199-x