Attention-enhanced multimodal feature fusion network for clothes-changing person re-identification
Main Authors: | Yongkang Ding, Jiechen Li, Hao Wang, Ziang Liu, Anqi Wang |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2024-11-01 |
Series: | Complex & Intelligent Systems |
Subjects: | Person re-identification; Clothes-changing scenarios; Computer vision; Image retrieval |
Online Access: | https://doi.org/10.1007/s40747-024-01646-2 |
author | Yongkang Ding; Jiechen Li; Hao Wang; Ziang Liu; Anqi Wang |
collection | DOAJ |
description | Abstract Clothes-Changing Person Re-Identification is a challenging problem in computer vision, primarily due to the appearance variations caused by clothing changes across different camera views. Such variations pose significant challenges to traditional person re-identification techniques that rely on clothing features, including the inconsistency of clothing across sightings and the difficulty of learning reliable clothing-irrelevant local features. To address these issues, we propose a novel network architecture called the Attention-Enhanced Multimodal Feature Fusion Network (AE-Net). AE-Net mitigates the impact of clothing changes on recognition accuracy by integrating RGB global features, grayscale image features, and clothing-irrelevant features obtained through semantic segmentation. Specifically, the global features capture the overall appearance of the person; the grayscale features eliminate the interference of color in recognition; and the clothing-irrelevant features derived from semantic segmentation compel the model to learn representations independent of the person’s clothing. Additionally, we introduce a multi-scale fusion attention mechanism that further enhances the model’s ability to capture both fine detail and global structure, thereby improving recognition accuracy and robustness. Extensive experiments demonstrate that AE-Net outperforms several state-of-the-art methods on the PRCC and LTCC datasets, particularly in scenarios with significant clothing changes, achieving Top-1 accuracy of 60.4% on PRCC and 42.9% on LTCC. |
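The record is prose-only; as a reading aid, here is a minimal PyTorch sketch of the three-branch design the description outlines (RGB, grayscale, and segmentation-masked inputs fused by attention). Every concrete choice below — the ResNet-18 backbones, the 512-dimensional embeddings, the single-scale softmax attention, and the `body_mask` input — is an illustrative assumption, not the paper's actual AE-Net architecture.

```python
# Minimal sketch of a three-branch fusion model in the spirit of AE-Net.
# Backbones, dimensions, and the attention form are assumptions made for
# illustration; they are not taken from the paper.
import torch
import torch.nn as nn
import torchvision.models as models


def to_grayscale(rgb: torch.Tensor) -> torch.Tensor:
    # (B, 3, H, W) RGB -> 3-channel luminance, removing color cues.
    gray = 0.299 * rgb[:, 0] + 0.587 * rgb[:, 1] + 0.114 * rgb[:, 2]
    return gray.unsqueeze(1).expand(-1, 3, -1, -1)


class FusionAttention(nn.Module):
    """Stand-in for the paper's multi-scale fusion attention: learns a
    softmax weight per branch and returns the weighted sum of embeddings."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, num_branches, dim)
        weights = torch.softmax(self.score(feats), dim=1)  # (B, branches, 1)
        return (weights * feats).sum(dim=1)                # (B, dim)


class ThreeBranchReID(nn.Module):
    def __init__(self, num_ids: int, dim: int = 512):
        super().__init__()

        def backbone() -> nn.Module:
            net = models.resnet18(weights=None)
            net.fc = nn.Linear(net.fc.in_features, dim)
            return net

        self.rgb_net = backbone()   # global RGB appearance
        self.gray_net = backbone()  # color-invariant structure
        self.seg_net = backbone()   # clothing-irrelevant regions
        self.fuse = FusionAttention(dim)
        self.classifier = nn.Linear(dim, num_ids)

    def forward(self, rgb: torch.Tensor, body_mask: torch.Tensor):
        # body_mask: (B, 1, H, W) semantic-segmentation mask selecting
        # non-clothing regions; multiplying it in suppresses clothing pixels.
        f_rgb = self.rgb_net(rgb)
        f_gray = self.gray_net(to_grayscale(rgb))
        f_seg = self.seg_net(rgb * body_mask)
        fused = self.fuse(torch.stack([f_rgb, f_gray, f_seg], dim=1))
        return fused, self.classifier(fused)  # retrieval embedding + ID logits
```

Training would typically combine an identity cross-entropy loss on the logits with a metric loss (e.g. triplet) on the fused embedding, and retrieval would compare embeddings by cosine distance; these choices are likewise assumptions, as the abstract does not specify them.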
format | Article |
id | doaj-art-2e8ba9afc8fb474984f7eb0306c70674 |
institution | Kabale University |
issn | 2199-4536; 2198-6053 |
language | English |
publishDate | 2024-11-01 |
publisher | Springer |
series | Complex & Intelligent Systems |
citation | Complex & Intelligent Systems, vol. 11, no. 1, pp. 1–15, 2024-11-01. doi:10.1007/s40747-024-01646-2 |
affiliations | Yongkang Ding and Anqi Wang: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics; Jiechen Li: Engineering in Electrical Engineering, University of Southern California; Hao Wang: School of Computer Science, Carnegie Mellon University; Ziang Liu: Electrical and Computer Engineering, Carnegie Mellon University |
title | Attention-enhanced multimodal feature fusion network for clothes-changing person re-identification |
topic | Person re-identification; Clothes-changing scenarios; Computer vision; Image retrieval |
url | https://doi.org/10.1007/s40747-024-01646-2 |