An Enhanced End-to-End Object Detector for Drone Aerial Imagery

DETR-like detectors have gained increasing popularity in current practical applications. However, we observe that their pipelines still suffer from several challenges, including an unbalanced distribution of positive and negative samples, low-quality initial prediction boxes, and unreasonable gradient s...

Bibliographic Details
Main Authors: Quan Yu, Qiang Tong, Lin Miao, Lin Qi, Xiulei Liu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Drone aerial imagery; object detection; end-to-end object detector; detection transformer
Online Access: https://ieeexplore.ieee.org/document/10851266/
author Quan Yu
Qiang Tong
Lin Miao
Lin Qi
Xiulei Liu
collection DOAJ
description DETR-like detectors have gained increasing popularity in practical applications. However, we observe that their pipelines still suffer from several challenges, including an unbalanced distribution of positive and negative samples, low-quality initial prediction boxes, and an unreasonable gradient structure in the decoding stage. These challenges hinder both the convergence speed and the detection performance of the model. To address these issues, we propose an enhanced DETR-like model called EM-DETR. It combines three methods: Dynamic Groups Assignment, Mixed Query Re-Selection, and Look Forward Stage. Dynamic Groups Assignment employs adaptive parameters to balance the number of positive and negative samples, providing more effective supervision signals for ground-truth boxes. Mixed Query Re-Selection uses high-quality bounding boxes regressed by a subnet to initialize decoder queries, offering better prior information for the decoder. Look Forward Stage introduces a more rational gradient structure that eliminates inter-layer information bias between decoders. We conduct extensive experiments to evaluate the effectiveness of the proposed method. On VisDrone2021-DET, EM-DETR with ResNet50 achieves 23.9% AP after 12 epochs of training, an improvement of 4.7% AP over the baseline. Moreover, the strong performance of EM-DETR on AI-TOD and CrowdHuman demonstrates the generalization capability of the proposed method.
format Article
id doaj-art-5f8503bac56d4568bcab8f3a709f291e
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-5f8503bac56d4568bcab8f3a709f291e
Timestamp: 2025-01-31T00:00:54Z
Language: eng
Publisher: IEEE
Journal: IEEE Access (ISSN 2169-3536)
Published: 2025-01-01, vol. 13, pp. 18798-18813
DOI: 10.1109/ACCESS.2025.3533037
IEEE Xplore document: 10851266
Title: An Enhanced End-to-End Object Detector for Drone Aerial Imagery
Authors and affiliations:
Quan Yu (https://orcid.org/0009-0007-2635-7322), Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University, Beijing, China
Qiang Tong (https://orcid.org/0000-0003-0508-0727), Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University, Beijing, China
Lin Miao, Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University, Beijing, China
Lin Qi, School of Economics and Management, Beijing Information Science and Technology University, Beijing, China
Xiulei Liu (https://orcid.org/0000-0002-9303-3682), Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing Information Science and Technology University, Beijing, China
title An Enhanced End-to-End Object Detector for Drone Aerial Imagery
topic Drone aerial imagery
object detection
end-to-end object detector
detection transformer
url https://ieeexplore.ieee.org/document/10851266/