PLZero: placeholder based approach to generalized zero-shot learning for multi-label recognition in chest radiographs

Abstract By leveraging large-scale image-text paired data for pre-training, the model can efficiently learn the alignment between images and text, significantly advancing the development of zero-shot learning (ZSL) in the field of intelligent medical image analysis. However, the heterogeneity betwee...

Bibliographic Details
Main Authors:	Chengrong Yang, Qiwen Jin, Fei Du, Jing Guo, Yujue Zhou
Format:	Article
Language:	English
Published:	Springer 2025-01-01
Series:	Complex & Intelligent Systems
Subjects:	Generalized zero-shot learning Placeholder learning Multi-label recognition
Online Access:	https://doi.org/10.1007/s40747-024-01717-4

_version_	1832571175625031680
author	Chengrong Yang Qiwen Jin Fei Du Jing Guo Yujue Zhou
author_facet	Chengrong Yang Qiwen Jin Fei Du Jing Guo Yujue Zhou
author_sort	Chengrong Yang
collection	DOAJ
description	Abstract By leveraging large-scale image-text paired data for pre-training, the model can efficiently learn the alignment between images and text, significantly advancing the development of zero-shot learning (ZSL) in the field of intelligent medical image analysis. However, the heterogeneity between cross-modalities, false negatives in image-text pairs, and domain shift phenomena pose challenges, making it difficult for existing methods to effectively learn the deep semantic relationships between images and text. To address these challenges, we propose a multi-label chest X-ray recognition generalized ZSL framework based on placeholder learning, termed PLZero. Specifically, we first introduce a jointed embedding space learning module (JESL) to encourage the model to better capture the diversity among different labels. Secondly, we propose a hallucinated class generation module (HCG), which generates hallucinated classes by feature diffusion and feature fusion based on the visual and semantic features of seen classes, using these hallucinated classes as placeholders for unseen classes. Finally, we propose a hallucinated class-based prototype learning module (HCPL), which leverages contrastive learning to control the distribution of hallucinated classes around seen classes without significant deviation from the original data, encouraging high dispersion of class prototypes for seen classes to create sufficient space for inserting unseen class samples. Extensive experiments demonstrate that our method exhibits sufficient generalization and achieves the best performance across three classic and challenging chest X-ray datasets: NIH Chest X-ray 14, CheXpert, and ChestX-Det10. Notably, our method outperforms others even when the number of unseen classes exceeds the experimental settings of other methods. The codes are available at: https://github.com/jinqiwen/PLZero .
format	Article
id	doaj-art-792359295b824ee58acfc248f0c5ee42
institution	Kabale University
issn	2199-4536 2198-6053
language	English
publishDate	2025-01-01
publisher	Springer
record_format	Article
series	Complex & Intelligent Systems
spelling	doaj-art-792359295b824ee58acfc248f0c5ee42; 2025-02-02T12:49:09Z; eng; Springer; Complex & Intelligent Systems; 2199-4536; 2198-6053; 2025-01-01; 111113; 10.1007/s40747-024-01717-4; PLZero: placeholder based approach to generalized zero-shot learning for multi-label recognition in chest radiographs; Chengrong Yang (School of Information Science and Engineering, Yunnan University); Qiwen Jin (Yunnan Key Laboratory of Software Engineering, Yunnan University); Fei Du (Yunnan Key Laboratory of Software Engineering, Yunnan University); Jing Guo (Yunnan Key Laboratory of Software Engineering, Yunnan University); Yujue Zhou (Yunnan Key Laboratory of Software Engineering, Yunnan University); https://doi.org/10.1007/s40747-024-01717-4; Generalized zero-shot learning; Placeholder learning; Multi-label recognition
title	PLZero: placeholder based approach to generalized zero-shot learning for multi-label recognition in chest radiographs
title_sort	plzero placeholder based approach to generalized zero shot learning for multi label recognition in chest radiographs
topic	Generalized zero-shot learning Placeholder learning Multi-label recognition
url	https://doi.org/10.1007/s40747-024-01717-4

Similar Items