Detection of water surface targets based on improved Deformable DETR

Objective With technological advancements and the increasing demand for water resource exploration, water surface target detection plays a crucial role in various applications, such as ship navigation and maritime safety. However, conventional detection methods encounter several challenges, and exis...

Bibliographic Details
Main Authors: Pengjiu WANG, Junbin Gong, Wei LUO, Xiao HUANG, Junjie GUO
Format: Article
Language: English
Published: Editorial Office of Chinese Journal of Ship Research 2025-06-01
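Series: Zhongguo Jianchuan Yanjiu
Subjects: water surface target; target detection; performance optimization; target tracking; automatic target recognition
Online Access: http://www.ship-research.com/en/article/doi/10.19693/j.issn.1673-3185.03645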
_version_ 1849422898811371520
author Pengjiu WANG
Junbin Gong
Wei LUO
Xiao HUANG
Junjie GUO
collection DOAJ
description Objective With technological advancements and growing demand for water resource exploration, water surface target detection plays a crucial role in applications such as ship navigation and maritime safety. However, conventional detection methods face several challenges, and existing deep-learning-based algorithms remain limited in this field by scarce datasets and by detection speed that is insufficient even after improvement. This study aims to develop an improved object-detection algorithm based on Deformable DETR for automatic recognition of water surface targets. The algorithm is designed to significantly speed up training and inference while improving detection accuracy, thus achieving more efficient and robust detection of water surface targets. Methods First, a new water surface target dataset was constructed. Then, the original feature-extraction network of Deformable DETR was replaced with the lightweight MobileNetV3, which is available in multiple versions and combines high recognition accuracy with a small parameter count. The MobileNetV3-Small version was chosen as the feature-extraction backbone. It combines depthwise separable convolution with SE modules and the Hard-swish activation function. To further reduce the model size and enhance detection ability, three output feature maps from specific modules of MobileNetV3-Small were used directly for multi-scale feature extraction. Second, the CBAM attention module was introduced. CBAM is a lightweight yet versatile module that integrates channel attention and spatial attention and can be incorporated seamlessly into the network. Replacing the SE module in MobileNetV3 with CBAM further improved the model's feature-extraction capability. 
CBAM's channel attention module applies both average pooling and max pooling to the input feature map, followed by a shared neural network and a Sigmoid function, to generate channel-attention features. The spatial attention module first applies pooling operations along the channel dimension of the feature map refined by the channel attention module, followed by convolution and Sigmoid activation, to obtain spatial-attention features. Finally, the improved Deformable DETR network was obtained by integrating MobileNetV3 and the CBAM attention module. The input image is processed by the MobileNetV3-Small network embedded with CBAM, from which three multi-scale feature maps are extracted. These feature maps are further refined and then fed into the Transformer structure of Deformable DETR. Results Ablation experiments were carried out on the self-constructed dataset and the ABOships dataset. On the self-constructed dataset, compared with the original Deformable DETR model, the improved algorithm reduced the model's parameter count and size to about one-third of the original. The mAP0.5:0.95 increased by 2.4%. Training time was reduced to 41.7% of that required by the original algorithm. On the ABOships dataset, the mAP0.5:0.95 increased by 7.5%, and the training time was reduced to 51.9% of that required by the original. During training, the model's loss function converged faster and more stably. In comparison tests with other common algorithms (YOLOv3, Faster R-CNN, Mask R-CNN) on the ABOships dataset, the improved algorithm demonstrated superior performance. Its mAP0.5 reached 50.0%, higher than the other algorithms. Its mAP0.5:0.95 was 21.7%, leading in fine-grained detection. The model's parameter count was only 12.9 M, much lower than that of the other models, indicating high parameter efficiency. 
Although the frame rate was slightly lower than that of YOLOv3 and Faster R-CNN, it was significantly higher than that of Mask R-CNN, maintaining a reasonable processing speed while ensuring high detection accuracy. Conclusions The improved Deformable DETR algorithm proposed in this paper effectively improves the performance of water surface target detection. It substantially reduces the model's parameter count and storage footprint, accelerates training and inference, and enhances recognition accuracy. The experimental results on different datasets verify the effectiveness of the algorithm. This study explores a novel approach to applying DETR-class algorithms in water surface target detection, indicating their potential in this field.
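The CBAM steps described in the abstract (channel attention from average and max pooling through a shared network, then spatial attention from channel-wise pooling followed by a convolution) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the nested-list tensor layout (C x H x W), the bias-free two-layer shared network, the zero padding, and the caller-supplied kernel size are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function used by both attention branches."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap, w1, w2):
    """Return one weight per channel.

    fmap: C x H x W nested lists.
    w1:   hidden x C weights, w2: C x hidden weights (assumed bias-free
          shared two-layer network with ReLU in between).
    """
    avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(max(row) for row in ch) for ch in fmap]

    def shared_mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    # The same network processes both pooled vectors; outputs are summed.
    return [sigmoid(a + m) for a, m in zip(shared_mlp(avg), shared_mlp(mx))]


def spatial_attention(fmap, kernel):
    """Return an H x W weight map.

    kernel: 2 x k x k weights applied to the [average, max] maps pooled
            along the channel dimension (zero padding is an assumption).
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    k = len(kernel[0])
    pad = k // 2
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)]
           for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)]
          for i in range(H)]
    pooled = [avg, mx]
    out = []
    for i in range(H):
        row = []
        for j in range(W):
            s = 0.0
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[p][di][dj] * pooled[p][y][x]
            row.append(sigmoid(s))
        out.append(row)
    return out


def cbam(fmap, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    ca = channel_attention(fmap, w1, w2)
    refined = [[[ca[c] * v for v in row] for row in ch]
               for c, ch in enumerate(fmap)]
    sa = spatial_attention(refined, kernel)
    return [[[sa[i][j] * v for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for ch in refined]
```

Because both attention maps pass through a Sigmoid, every output value is the input value scaled by a factor between 0 and 1. In a real network the weights would be learned and the operation would run on tensors in a deep-learning framework rather than on nested lists.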
format Article
id doaj-art-10f73368ab7d4b7cb7d4fea3a5d14d24
institution Kabale University
issn 1673-3185
language English
publishDate 2025-06-01
publisher Editorial Office of Chinese Journal of Ship Research
record_format Article
series Zhongguo Jianchuan Yanjiu
spelling doaj-art-10f73368ab7d4b7cb7d4fea3a5d14d24
2025-08-20T03:30:52Z
eng
Editorial Office of Chinese Journal of Ship Research
Zhongguo Jianchuan Yanjiu
1673-3185
2025-06-01
20(3): 305-317
10.19693/j.issn.1673-3185.03645
ZG3645
Detection of water surface targets based on improved Deformable DETR
Pengjiu WANG (China Ship Development and Design Center, Wuhan 430064, China)
Junbin Gong (Hanjiang National Laboratory, Wuhan 430060, China)
Wei LUO (China Ship Development and Design Center, Wuhan 430064, China)
Xiao HUANG (China Ship Development and Design Center, Wuhan 430064, China)
Junjie GUO (China Ship Development and Design Center, Wuhan 430064, China)
http://www.ship-research.com/en/article/doi/10.19693/j.issn.1673-3185.03645
water surface target
target detection
performance optimization
target tracking
automatic target recognition
title Detection of water surface targets based on improved Deformable DETR
topic water surface target
target detection
performance optimization
target tracking
automatic target recognition
url http://www.ship-research.com/en/article/doi/10.19693/j.issn.1673-3185.03645