River floating object detection with transformer model in real time

Bibliographic Details
Main Authors: Chong Zhang, Jie Yue, Jianglong Fu, Shouluan Wu
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-93659-1
Description
Summary: The DEtection TRansformer (DETR) and the YOLO series have been at the forefront of advancements in object detection. The RT-DETR, a member of the DETR family, has notably addressed the speed limitations of its predecessors by utilizing a high-performance hybrid encoder that optimizes query selection. Building upon this foundation, we introduce the LR-DETR, a lightweight evolution of RT-DETR for river floating object detection. This model incorporates the High-level Screening-feature Path Aggregation Network (HS-PAN), which refines feature fusion through a novel bottom-up fusion path, significantly enhancing its expressive power. Further innovation is evident in the introduction of the Residual Partial Convolutional Network (RPCN) as the backbone, which selectively applies convolutions to key channels, leveraging the concept of residuals to reduce computational redundancy and enhance accuracy. The enhancement of the RepBlock with Conv3XCBlock, along with the integration of a parameter-free attention mechanism within the convolutional layers, underscores our commitment to efficiency, ensuring that the model prioritizes valuable information while suppressing redundancy. A comparative analysis with existing detection models not only validates the effectiveness of our approach but also highlights its superiority and adaptability. Our experimental findings are compelling: LR-DETR achieves a 5% increase in mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5, a 25.8% reduction in parameter count, and a 22.8% decrease in GFLOPs, compared to the RT-DETR algorithm. These improvements are particularly pronounced in the real-time detection of river floating objects, showcasing LR-DETR’s potential in specific environmental monitoring scenarios. Project page: https://github.com/zcfanhua/LR-DETR
ISSN: 2045-2322
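
Note: The summary states that the RPCN backbone applies convolutions to only some channels in order to cut redundant computation. The following is a rough sketch of why that lowers cost, not the authors' implementation. The 1/4 channel ratio, the layer sizes, and all function names are illustrative assumptions that do not come from the record, and the residual connection is not modeled.

```python
# Hypothetical cost model: compares a full k x k convolution with one that
# convolves only a fraction of the channels and passes the rest through.

def conv_params(c_in, c_out, k):
    """Number of weights in a k x k convolution without bias."""
    return c_in * c_out * k * k


def conv_macs(c_in, c_out, k, h, w):
    """Multiply-accumulates for a stride-1, same-padded k x k convolution."""
    return conv_params(c_in, c_out, k) * h * w


def partial_conv_macs(channels, k, h, w, ratio=0.25):
    """Cost when only `ratio` of the channels is convolved (assumed ratio)."""
    c_p = int(channels * ratio)  # channels that actually go through the conv
    return conv_macs(c_p, c_p, k, h, w)


# Assumed example layer: 64 channels, 3 x 3 kernel, 80 x 80 feature map.
full = conv_macs(64, 64, 3, 80, 80)
partial = partial_conv_macs(64, 3, 80, 80)
saving = 1 - partial / full  # fraction of multiply-accumulates avoided

print(full, partial, saving)
```

Under these assumptions the convolved part costs one sixteenth of the full layer, because cost scales with the square of the channel count. The reductions reported in the summary (25.8% fewer parameters, 22.8% fewer GFLOPs) are whole-model figures and cannot be derived from this single-layer estimate.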