Deep unsupervised clustering for prostate auto-segmentation with and without hydrogel spacer
Introduction. Clinical datasets for training deep learning (DL) models often exhibit high levels of heterogeneity due to differences such as patient characteristics, new medical techniques, and physician preferences. In recent years, hydrogel spacers have been used in some prostate cancer patients r...
Main Authors: | Hengrui Zhao, Biling Wang, Michael Dohopolski, Ti Bai, Steve Jiang, Dan Nguyen |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing 2025-01-01 |
Series: | Machine Learning: Science and Technology |
Subjects: | medical image segmentation, deep learning, unsupervised learning, clustering, hydrogel spacer |
Online Access: | https://doi.org/10.1088/2632-2153/ada8f3 |
_version_ | 1832590923758829568 |
---|---|
author | Hengrui Zhao Biling Wang Michael Dohopolski Ti Bai Steve Jiang Dan Nguyen |
author_sort | Hengrui Zhao |
collection | DOAJ |
description | Introduction. Clinical datasets for training deep learning (DL) models are often highly heterogeneous because of differences in patient characteristics, new medical techniques, and physician preferences. In recent years, hydrogel spacers have been used in some prostate cancer patients receiving radiotherapy to separate the prostate from the rectum, sparing the rectum while maintaining adequate dose coverage of the prostate. However, the spacer substantially changes the appearance of computed tomography images, which in turn reduces the contouring accuracy of auto-segmentation algorithms. The result is a highly heterogeneous dataset. Methods. To address this issue, we propose to identify underlying clusters within the dataset and use the cluster labels for segmentation. We collected a clinical dataset of 909 patients, including those with two types of hydrogel spacers and those without. First, we trained a DL model to locate the prostate and restricted the field of view to the local area surrounding the prostate and rectum. We then used Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and k-means clustering to assign each patient to a cluster. To leverage the clustered data, we propose a text-guided segmentation model, contrastive language and image pre-training (CLIP)-UNet, which encodes the cluster information with a text encoder and combines the encoded text information with image features for segmentation. Results. The UMAP results indicated up to three clusters within the dataset. CLIP-UNet with cluster information achieved a Dice score of 86.2%, compared with 84.4% for the baseline UNet. Additionally, CLIP-UNet outperformed other state-of-the-art models with or without cluster information. Conclusion. Automatic clustering assisted by DL can reveal hidden data clusters in clinical datasets, and CLIP-UNet effectively utilizes the cluster labels to achieve higher performance. |
format | Article |
id | doaj-art-e6d79ec8a85e4e8392922ad325fecb1b |
institution | Kabale University |
issn | 2632-2153 |
language | English |
publishDate | 2025-01-01 |
publisher | IOP Publishing |
record_format | Article |
series | Machine Learning: Science and Technology |
spelling | doaj-art-e6d79ec8a85e4e8392922ad325fecb1b; 2025-01-23T06:30:25Z; eng; IOP Publishing; Machine Learning: Science and Technology; 2632-2153; 2025-01-01; 6; 1; 015015; 10.1088/2632-2153/ada8f3; Deep unsupervised clustering for prostate auto-segmentation with and without hydrogel spacer; Hengrui Zhao (https://orcid.org/0000-0002-6712-5823), Biling Wang (https://orcid.org/0000-0002-6894-8645), Michael Dohopolski (https://orcid.org/0000-0002-9043-1490), Ti Bai (https://orcid.org/0000-0002-6697-7434), Steve Jiang (https://orcid.org/0000-0002-3083-6752), Dan Nguyen (https://orcid.org/0000-0002-9590-0655); affiliation of all six authors: Medical Artificial Intelligence and Automation (MAIA) Laboratory and Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, United States of America; https://doi.org/10.1088/2632-2153/ada8f3; medical image segmentation; deep learning; unsupervised learning; clustering; hydrogel spacer |
title | Deep unsupervised clustering for prostate auto-segmentation with and without hydrogel spacer |
title_sort | deep unsupervised clustering for prostate auto segmentation with and without hydrogel spacer |
topic | medical image segmentation deep learning unsupervised learning clustering hydrogel spacer |
url | https://doi.org/10.1088/2632-2153/ada8f3 |
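
The description above mentions two steps that are easy to illustrate in isolation: k-means assignment of low-dimensional embeddings to clusters, and the Dice score used to report segmentation accuracy. The following is a minimal standard-library sketch of both, not the authors' implementation. The function names, the farthest-point initialization, and the toy 2-D points (standing in for UMAP output) are assumptions made for illustration; the record does not specify any of them.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-point initialization.

    The initialization is an illustrative choice, not taken from the record.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


def dice(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 1.0 if total == 0 else 2.0 * inter / total


# Hypothetical 2-D embeddings: three well-separated groups of three patients.
embeddings = [(0, 0), (0, 1), (1, 0),
              (10, 10), (10, 11), (11, 10),
              (20, 0), (20, 1), (21, 0)]
labels, centers = kmeans(embeddings, k=3)
print(labels)
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
```

In practice the embeddings would come from a dimensionality-reduction step such as UMAP, and the masks would be 3-D label volumes rather than flat lists; the sketch only shows the arithmetic of each step.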