RGB-T Object Detection With Failure Scenarios

Currently, RGB-thermal (RGB-T) object detection algorithms have demonstrated excellent performance, but issues such as modality failure caused by fog, strong light, sensor damage, and other conditions can significantly impact the detector's performance. This article proposes a multimodal...

Bibliographic Details
Main Authors: Qingwang Wang, Yuxuan Sun, Yongke Chi, Tao Shen
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
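Subjects: Diffusion model; kernel method; multimodal remote sensing; object detection; RGB-thermal (RGB-T) images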
Online Access: https://ieeexplore.ieee.org/document/10817087/
author Qingwang Wang
Yuxuan Sun
Yongke Chi
Tao Shen
collection DOAJ
description Currently, RGB-thermal (RGB-T) object detection algorithms have demonstrated excellent performance, but issues such as modality failure caused by fog, strong light, sensor damage, and other conditions can significantly impact the detector's performance. This article proposes a multimodal object detection method named diffusion enhanced object detection network (DENet), aiming to address modality failure problems caused by nonroutine environments, sensor anomalies, and other factors, while suppressing redundant information in multimodal data to improve model accuracy. First, we design a multidimensional incremental information generation module based on a diffusion model, which reconstructs the unstable information of RGB-T images through the reverse diffusion process using the original fusion feature map. To further address the issue of redundant information in existing RGB-T object detection models, a redundant information suppression module is introduced, minimizing cross-modal redundant information based on mutual information and contrastive loss. Finally, a kernel similarity-aware illumination module (KSIM) is introduced to dynamically adjust the weighting of RGB and thermal features by incorporating both illumination intensity and the similarity between modalities. KSIM can fine-tune the contribution of each modality during fusion, ensuring a more precise balance that improves recognition performance across diverse conditions. Experimental results on the DroneVehicle and VEDAI datasets show that DENet performs outstandingly in multimodal object detection tasks, effectively improving detection accuracy and reducing the impact of modality failure on performance.
format Article
id doaj-art-9e516df2c41a4c868cb8c6a6db3dbb39
institution Kabale University
issn 1939-1404
2151-1535
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-9e516df2c41a4c868cb8c6a6db3dbb39
2025-01-21T00:00:31Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
1939-1404
2151-1535
2025-01-01
18
3000
3010
10.1109/JSTARS.2024.3523408
10817087
RGB-T Object Detection With Failure Scenarios
Qingwang Wang 0 https://orcid.org/0000-0001-5820-5357
Yuxuan Sun 1 https://orcid.org/0000-0002-5619-6394
Yongke Chi 2
Tao Shen 3 https://orcid.org/0000-0003-1273-7950
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
Autonomous Driving Research and Development Center, BYD Company Limited, Shenzhen, China
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
https://ieeexplore.ieee.org/document/10817087/
Diffusion model
kernel method
multimodal remote sensing
object detection
RGB-thermal (RGB-T) images
title RGB-T Object Detection With Failure Scenarios
topic Diffusion model
kernel method
multimodal remote sensing
object detection
RGB-thermal (RGB-T) images
url https://ieeexplore.ieee.org/document/10817087/