Metadata Enriched Multi-Instance Contrastive Learning for High-Quality Facial Skin Visual Representations

Bibliographic Details
Main Authors: Jihyo Kim, Sungchul Kim, Seungwon Seo, Bumsoo Kim, Daejeong Mun, Hoonjae Lee, Sangheum Hwang
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Applied Artificial Intelligence
Online Access: https://www.tandfonline.com/doi/10.1080/08839514.2025.2462389
Description
Summary: Using self-supervised learning to learn meaningful representations from unlabeled data can be a cost-effective strategy, particularly in medical domains where expert labeling is expensive. Contrastive learning typically employs a single contrastive relationship based on individual instances. However, for data with particular task-related characteristics, such as facial skin images, this approach may be unsuitable for learning useful representations. In this work, we propose an advanced contrastive learning method to learn high-quality facial skin representations that are useful for various downstream applications related to skin disorders, such as wrinkles and pigmentation. Our method leverages metadata to establish effective multi-instance contrastive relationships specifically for facial skin images. To this end, we construct mini-batches that integrate multiple contrastive relationships, enabling a model to learn the multifaceted features of facial skin. Using a facial skin image dataset, we demonstrate that the proposed method is more effective than conventional contrastive learning at classifying the severity of facial wrinkles and pigmentation. The features learned by the proposed method adapt well to other skin lesion datasets from different sources, demonstrating the transferability of the learned skin representations. Our study highlights the potential of application-specific batch configurations that leverage metadata to enhance the effectiveness of self-supervised learning.
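
The summary does not say which metadata fields or which loss function the authors use, so the following is only a minimal sketch of the general idea of metadata-defined multi-instance positives, not the paper's method. It assumes an InfoNCE-style loss in which any two samples in a mini-batch that share a metadata key count as positives. The function name `multi_positive_contrastive_loss`, the `positive_keys` argument, and the toy keys "a" and "b" are all hypothetical.

```python
import math


def multi_positive_contrastive_loss(embeddings, positive_keys, temperature=0.1):
    """InfoNCE-style loss in which each anchor may have several positives.

    embeddings: list of equal-length float vectors, one per image.
    positive_keys: list of hashable metadata keys; two samples are treated
        as positives when their keys are equal (an assumed rule, not taken
        from the paper).
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and positive_keys[j] == positive_keys[i]]
        if not positives:
            continue  # an anchor with no positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        logits = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                  for j in range(n) if j != i}
        # Numerically stable log-sum-exp over all non-anchor samples.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values()))
        # Average negative log-probability over this anchor's positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: samples 0 and 1 point one way, samples 2 and 3 another.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = multi_positive_contrastive_loss(embeddings, ["a", "a", "b", "b"])
misaligned = multi_positive_contrastive_loss(embeddings, ["a", "b", "a", "b"])
```

In this toy batch the loss is lower when the metadata keys group embeddings that are already close (`aligned`) than when they group distant ones (`misaligned`), which is the signal that would pull metadata-related samples together during training.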
ISSN: 0883-9514, 1087-6545