Explainable vision transformer for automatic visual sleep staging on multimodal PSG signals

Abstract Polysomnography (PSG) is crucial for diagnosing sleep disorders, but manual scoring of PSG is time-consuming and subjective, leading to high variability. While machine-learning models have improved PSG scoring, their clinical use is hindered by the ‘black-box’ nature. In this study, we pres...

Bibliographic Details
Main Authors: Hyojin Lee, You Rim Choi, Hyun Kyung Lee, Jaemin Jeong, Joopyo Hong, Hyun-Woo Shin, Hyung-Sin Kim
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-024-01378-0
collection DOAJ
description Abstract Polysomnography (PSG) is crucial for diagnosing sleep disorders, but manual scoring of PSG is time-consuming and subjective, leading to high variability. While machine-learning models have improved PSG scoring, their clinical use is hindered by the ‘black-box’ nature. In this study, we present SleepXViT, an automatic sleep staging system using Vision Transformer (ViT) that provides intuitive, consistent explanations by mimicking human ‘visual scoring’. Tested on KISS–a PSG image dataset from 7745 patients across four hospitals–SleepXViT achieved a Macro F1 score of 81.94%, outperforming baseline models and showing robust performances on public datasets SHHS1 and SHHS2. Furthermore, SleepXViT offers well-calibrated confidence scores, enabling expert review for low-confidence predictions, alongside high-resolution heatmaps highlighting essential features and relevance scores for adjacent epochs’ influence on sleep stage predictions. Together, these explanations reinforce the scoring consistency of SleepXViT, making it both reliable and interpretable, thereby facilitating the synergy between the AI model and human scorers in clinical settings.
id doaj-art-6a05e602155b401193aefabe86d3df3a
institution Kabale University
issn 2398-6352
spelling doaj-art-6a05e602155b401193aefabe86d3df3a; 2025-01-26T12:53:48Z; eng; Nature Portfolio; npj Digital Medicine; 2398-6352; 2025-01-01; 81114; 10.1038/s41746-024-01378-0
author affiliations (in author order, indexed 0 to 6)
0 Hyojin Lee: Graduate School of Data Science, Seoul National University
1 You Rim Choi: Graduate School of Data Science, Seoul National University
2 Hyun Kyung Lee: Obstructive Upper Airway Research (OUaR) Laboratory, Department of Pharmacology, Seoul National University College of Medicine
3 Jaemin Jeong: Department of Computer Engineering, School of Software, Hallym University
4 Joopyo Hong: Graduate School of Data Science, Seoul National University
5 Hyun-Woo Shin: Obstructive Upper Airway Research (OUaR) Laboratory, Department of Pharmacology, Seoul National University College of Medicine
6 Hyung-Sin Kim: Graduate School of Data Science, Seoul National University
title Explainable vision transformer for automatic visual sleep staging on multimodal PSG signals