Multi-Scale Feature Fusion and Context-Enhanced Spatial Sparse Convolution Single-Shot Detector for Unmanned Aerial Vehicle Image Object Detection
Main Authors: | Guimei Qi, Zhihong Yu, Jian Song |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | Applied Sciences |
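Subjects: | UAV image object detection; SSD; multi-scale feature fusion; context-enhanced spatial sparse convolution |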
Online Access: | https://www.mdpi.com/2076-3417/15/2/924 |
_version_ | 1832589163272077312 |
---|---|
author | Guimei Qi; Zhihong Yu; Jian Song |
author_facet | Guimei Qi; Zhihong Yu; Jian Song |
author_sort | Guimei Qi |
collection | DOAJ |
description | Accurate and efficient object detection in UAV images is challenging because target scales vary widely and small targets are abundant. This study investigates enhancing the detection head with sparse convolution, which proves effective in balancing accuracy and efficiency. However, sparse convolution incorporates global contextual information inadequately, and its fixed mask ratios make the network inflexible. To address these issues, this paper proposes MFFCESSC-SSD, a novel single-shot detector (SSD) with multi-scale feature fusion and context-enhanced spatial sparse convolution. First, a global context-enhanced group normalization (CE-GN) layer is developed to compensate for the information lost when convolution is applied only to the masked region. Second, a dynamic masking strategy is designed to determine the optimal mask ratios, ensuring compact foreground coverage that improves both accuracy and efficiency. Experiments on two datasets (VisDrone and ARH2000, the latter created by the authors) demonstrate that MFFCESSC-SSD substantially outperforms the SSD and numerous conventional object detection algorithms in both accuracy and efficiency. |
format | Article |
id | doaj-art-0b0cd3f91899469b81179a7117ab5980 |
institution | Kabale University |
issn | 2076-3417 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj-art-0b0cd3f91899469b81179a7117ab5980; 2025-01-24T13:21:21Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-01-01; vol. 15, no. 2, art. 924; DOI 10.3390/app15020924; Multi-Scale Feature Fusion and Context-Enhanced Spatial Sparse Convolution Single-Shot Detector for Unmanned Aerial Vehicle Image Object Detection; Guimei Qi (0), Zhihong Yu (1), Jian Song (2); (0) College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China; (1) College of Mechanical and Electrical Engineering, Inner Mongolia Agricultural University, Hohhot 010010, China; (2) College of Mechanical and Electrical Engineering, Inner Mongolia Agricultural University, Hohhot 010010, China; https://www.mdpi.com/2076-3417/15/2/924; UAV image object detection; SSD; multi-scale feature fusion; context-enhanced spatial sparse convolution |
title | Multi-Scale Feature Fusion and Context-Enhanced Spatial Sparse Convolution Single-Shot Detector for Unmanned Aerial Vehicle Image Object Detection |
topic | UAV image object detection; SSD; multi-scale feature fusion; context-enhanced spatial sparse convolution |
url | https://www.mdpi.com/2076-3417/15/2/924 |