Safe Semi-Supervised Contrastive Learning Using In-Distribution Data as Positive Examples
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11016683/ |
| Summary: | Semi-supervised learning (SSL) methods have shown promising results in solving many practical problems when only a few labels are available. Existing methods assume that the class distributions of labeled and unlabeled data are equal; however, their performance degrades significantly in class distribution mismatch scenarios, where out-of-distribution (OOD) data exist in the unlabeled data. Previous safe SSL studies have addressed this problem by making OOD data less likely to affect training based on the labeled data. However, even when these studies effectively filter out the unnecessary OOD data, they can lose the basic information that all data share regardless of class. To this end, we propose applying a self-supervised contrastive learning (SSCL) approach to fully exploit the large amount of unlabeled data. We also propose a contrastive loss function with a coefficient schedule that turns labeled negative examples sharing the anchor's class into positive examples. To evaluate the performance of the proposed method, we conduct experiments on image classification datasets (CIFAR-10, CIFAR-100, Tiny ImageNet, and CIFAR-100+Tiny ImageNet) under various mismatch ratios. The results show that SSCL significantly improves classification accuracy, and our proposed loss function further enhances the performance, collectively outperforming existing methods by 2-9% across various benchmark datasets. The performance gains become more pronounced as dataset complexity increases and remain robust even in challenging cross-dataset scenarios. |
| ISSN: | 2169-3536 |
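
As a rough illustration of the loss described in the summary above, the sketch below implements a SimCLR-style InfoNCE objective in which labeled examples sharing the anchor's class are promoted from negatives to positives, weighted by a ramp-up coefficient schedule. This is only one plausible reading of the abstract, not the authors' implementation; the function name, the batch layout (two augmented views stacked as [view1; view2]), the linear ramp schedule, and all hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

def scheduled_contrastive_loss(z, labels, epoch, total_epochs, temperature=0.5):
    """Illustrative sketch only (not the paper's code).

    z:      (2N, D) embeddings of two augmented views stacked as [view1; view2]
    labels: (2N,) class indices for labeled samples, -1 for unlabeled samples
    Combines the usual instance-level InfoNCE term with a class-level term that
    treats labeled samples of the anchor's class as extra positives, weighted by
    a coefficient that ramps up over training (assumed schedule).
    """
    z = F.normalize(z, dim=1)                        # work in cosine-similarity space
    sim = z @ z.t() / temperature
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))  # never contrast an anchor with itself

    # log-softmax over each anchor's candidate examples
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # instance-level positive: the other augmented view of the same image
    idx = torch.arange(n, device=z.device)
    pos_idx = (idx + n // 2) % n
    inst_loss = -log_prob[idx, pos_idx]

    # class-level positives: labeled examples sharing the anchor's class
    labeled = labels >= 0
    same_class = labels.unsqueeze(0) == labels.unsqueeze(1)
    same_class &= labeled.unsqueeze(0) & labeled.unsqueeze(1)
    same_class &= ~self_mask
    cls_log_prob = torch.where(same_class, log_prob, torch.zeros_like(log_prob))
    cls_loss = -cls_log_prob.sum(dim=1) / same_class.sum(dim=1).clamp(min=1)

    # assumed coefficient schedule: linear ramp over the first half of training
    lam = min(1.0, epoch / max(1, total_epochs // 2))
    return (inst_loss + lam * cls_loss).mean()
```

Under this reading, early training stays close to plain self-supervised contrastive learning, and the scarce labels are trusted only gradually as the coefficient ramps up, which is one way to interpret the "coefficient schedule" mentioned in the abstract.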