An Investigation of the Domain Gap in CLIP-Based Person Re-Identification

Person re-identification (re-id) is a critical computer vision task aimed at identifying individuals across multiple non-overlapping cameras, with wide-ranging applications in intelligent surveillance systems. Despite recent advances, the domain gap—performance degradation when models encounter unse...

Bibliographic Details
Main Authors: Andrea Asperti, Leonardo Naldi, Salvatore Fiorilla
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Sensors
Subjects: person re-identification; domain gap; CLIP; deep learning; computer vision
Online Access: https://www.mdpi.com/1424-8220/25/2/363
author Andrea Asperti
Leonardo Naldi
Salvatore Fiorilla
collection DOAJ
description Person re-identification (re-id) is a critical computer vision task aimed at identifying individuals across multiple non-overlapping cameras, with wide-ranging applications in intelligent surveillance systems. Despite recent advances, the domain gap—performance degradation when models encounter unseen datasets—remains a critical challenge. CLIP-based models, leveraging multimodal pre-training, offer potential for mitigating this issue by aligning visual and textual representations. In this study, we provide a comprehensive quantitative analysis of the domain gap in CLIP-based re-id systems across standard benchmarks, including Market-1501, DukeMTMC-reID, MSMT17, and Airport, simulating real-world deployment conditions. We systematically measure the performance of these models in terms of mean average precision (mAP) and Rank-1 accuracy, offering insights into the challenges faced during dataset transitions. Our analysis highlights the specific advantages introduced by CLIP’s visual–textual alignment and evaluates its contribution relative to strong image encoder baselines. Additionally, we evaluate the impact of extending training sets with non-domain-specific data and incorporating random erasing augmentation, achieving an average improvement of +4.3% in mAP and +4.0% in Rank-1 accuracy. Our findings underscore the importance of standardized benchmarks and systematic evaluations for enhancing reproducibility and guiding future research. This work contributes to a deeper understanding of the domain gap in re-id, while highlighting pathways for improving model robustness and generalization in diverse, real-world scenarios.
format Article
id doaj-art-df6e925fbc4c41a993bbec838aa7004e
institution Kabale University
issn 1424-8220
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj-art-df6e925fbc4c41a993bbec838aa7004e; 2025-01-24T13:48:39Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2025-01-01; vol. 25, no. 2, art. 363; DOI 10.3390/s25020363
An Investigation of the Domain Gap in CLIP-Based Person Re-Identification
Andrea Asperti, Leonardo Naldi, Salvatore Fiorilla (all: Department of Informatics—Science and Engineering (DISI), University of Bologna, 40126 Bologna, Italy)
title An Investigation of the Domain Gap in CLIP-Based Person Re-Identification
topic person re-identification
domain gap
CLIP
deep learning
computer vision
url https://www.mdpi.com/1424-8220/25/2/363