PLZero: placeholder based approach to generalized zero-shot learning for multi-label recognition in chest radiographs

Bibliographic Details
Main Authors: Chengrong Yang, Qiwen Jin, Fei Du, Jing Guo, Yujue Zhou
Format: Article
Language: English
Published: Springer 2025-01-01
Series: Complex & Intelligent Systems
Subjects: Generalized zero-shot learning; Placeholder learning; Multi-label recognition
Online Access: https://doi.org/10.1007/s40747-024-01717-4
collection DOAJ
description Abstract Pre-training on large-scale image-text paired data lets a model efficiently learn the alignment between images and text, which has significantly advanced zero-shot learning (ZSL) in intelligent medical image analysis. However, heterogeneity across modalities, false negatives in image-text pairs, and domain shift make it difficult for existing methods to learn the deep semantic relationships between images and text. To address these challenges, we propose PLZero, a generalized ZSL framework for multi-label chest X-ray recognition based on placeholder learning. First, we introduce a jointed embedding space learning module (JESL) that encourages the model to better capture the diversity among labels. Second, we propose a hallucinated class generation module (HCG), which generates hallucinated classes by feature diffusion and feature fusion over the visual and semantic features of seen classes, and uses these hallucinated classes as placeholders for unseen classes. Finally, we propose a hallucinated class-based prototype learning module (HCPL), which uses contrastive learning to keep hallucinated classes distributed around seen classes without significant deviation from the original data, while encouraging high dispersion among seen-class prototypes so that there is sufficient space for inserting unseen-class samples. Extensive experiments show that our method generalizes well and achieves the best performance among the compared methods on three classic and challenging chest X-ray datasets: NIH Chest X-ray 14, CheXpert, and ChestX-Det10. Notably, our method outperforms others even when the number of unseen classes exceeds the experimental settings of other methods. The code is available at https://github.com/jinqiwen/PLZero
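The placeholder idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how feature fusion or feature diffusion are computed, so here fusion is assumed to be a convex combination of two seen-class prototypes and diffusion is assumed to be additive Gaussian noise. The names `fuse`, `diffuse`, and `hallucinate`, and the parameters `lam` and `sigma`, are hypothetical.

```python
import random
from typing import List

Vector = List[float]


def fuse(a: Vector, b: Vector, lam: float) -> Vector:
    """Convex combination of two seen-class features (assumed fusion rule)."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def diffuse(v: Vector, sigma: float, rng: random.Random) -> Vector:
    """Additive Gaussian noise (assumed stand-in for feature diffusion)."""
    return [x + rng.gauss(0.0, sigma) for x in v]


def hallucinate(seen: List[Vector], n: int,
                sigma: float = 0.05, seed: int = 0) -> List[Vector]:
    """Build n placeholder prototypes from random pairs of seen-class prototypes."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out: List[Vector] = []
    for _ in range(n):
        a, b = rng.sample(seen, 2)    # pick two distinct seen classes
        lam = rng.uniform(0.3, 0.7)   # stay between the two parents
        out.append(diffuse(fuse(a, b, lam), sigma, rng))
    return out


# Three toy seen-class prototypes in a 3-dimensional embedding space.
seen_prototypes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
placeholders = hallucinate(seen_prototypes, n=4)
```

Each placeholder lies near the segment between two seen-class prototypes, which loosely mirrors the abstract's description of hallucinated classes staying close to seen classes while reserving space for unseen ones. The contrastive objective of HCPL is not sketched here.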
id doaj-art-792359295b824ee58acfc248f0c5ee42
institution Kabale University
issn 2199-4536
2198-6053
affiliation Chengrong Yang: School of Information Science and Engineering, Yunnan University
Qiwen Jin: Yunnan Key Laboratory of Software Engineering, Yunnan University
Fei Du: Yunnan Key Laboratory of Software Engineering, Yunnan University
Jing Guo: Yunnan Key Laboratory of Software Engineering, Yunnan University
Yujue Zhou: Yunnan Key Laboratory of Software Engineering, Yunnan University
topic Generalized zero-shot learning
Placeholder learning
Multi-label recognition