Deep unsupervised clustering for prostate auto-segmentation with and without hydrogel spacer

Bibliographic Details
Main Authors: Hengrui Zhao, Biling Wang, Michael Dohopolski, Ti Bai, Steve Jiang, Dan Nguyen
Format: Article
Language: English
Published: IOP Publishing 2025-01-01
Series: Machine Learning: Science and Technology
Online Access: https://doi.org/10.1088/2632-2153/ada8f3
collection DOAJ
description Introduction. Clinical datasets for training deep learning (DL) models often exhibit high levels of heterogeneity due to factors such as patient characteristics, new medical techniques, and physician preferences. In recent years, hydrogel spacers have been used in some prostate cancer patients receiving radiotherapy to separate the prostate from the rectum, sparing the rectum while achieving adequate dose coverage of the prostate. However, the spacer substantially alters the computed tomography image appearance, which in turn reduces the contouring accuracy of auto-segmentation algorithms and makes the dataset highly heterogeneous. Methods. To address this issue, we propose to identify underlying clusters within the dataset and use the cluster labels for segmentation. We collected a clinical dataset of 909 patients, including those with two types of hydrogel spacers and those without. First, we trained a DL model to locate the prostate and limited the field of view to the local area surrounding the prostate and rectum. We then used Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and k-means clustering to assign each patient to a cluster. To leverage the clustered data, we propose a text-guided segmentation model, contrastive language and image pre-training (CLIP)-UNet, which encodes the cluster information using a text encoder and combines the encoded text information with image features for segmentation. Results. The UMAP results indicated up to three clusters within the dataset. CLIP-UNet with cluster information achieved a Dice score of 86.2%, compared to 84.4% for the baseline UNet. Additionally, CLIP-UNet outperformed other state-of-the-art models with or without cluster information. Conclusion. Automatic clustering assisted by DL can reveal hidden data clusters in clinical datasets, and CLIP-UNet effectively uses the cluster labels to achieve higher performance.
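The clustering step and the Dice metric mentioned in the description can be illustrated with a minimal sketch. It assumes the per-patient features have already been reduced to a 2-D embedding, as UMAP would produce; here that embedding is replaced by synthetic Gaussian blobs. The function names `farthest_point_init`, `kmeans`, and `dice_score`, the farthest-point initialisation, and all numeric settings are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def farthest_point_init(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [points[0]]
    for _ in range(k - 1):
        diffs = points[:, None, :] - np.array(centroids)[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        centroids.append(points[nearest.argmax()])
    return np.array(centroids)


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means on an (n_samples, n_dims) array.
    Returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        diffs = points[:, None, :] - centroids[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


# Synthetic stand-in for a 2-D embedding: three separated groups of 30 "patients".
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
embedding = np.concatenate(
    [c + 0.5 * rng.standard_normal((30, 2)) for c in centers]
)

labels, centroids = kmeans(embedding, k=3)
print("cluster sizes:", np.bincount(labels, minlength=3))

# Toy masks: the prediction covers two pixels, the reference covers one of them.
print("Dice:", dice_score([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
```

In a pipeline like the one described, the integer cluster label for each patient would then be turned into a text input for the text encoder; how that text is worded is not stated in this record, so it is left out of the sketch.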
id doaj-art-e6d79ec8a85e4e8392922ad325fecb1b
institution Kabale University
issn 2632-2153
spelling doaj-art-e6d79ec8a85e4e8392922ad325fecb1b
2025-01-23T06:30:25Z
eng
IOP Publishing
Machine Learning: Science and Technology, ISSN 2632-2153
2025-01-01, volume 6, issue 1, article 015015
10.1088/2632-2153/ada8f3
Deep unsupervised clustering for prostate auto-segmentation with and without hydrogel spacer
Hengrui Zhao, https://orcid.org/0000-0002-6712-5823
Biling Wang, https://orcid.org/0000-0002-6894-8645
Michael Dohopolski, https://orcid.org/0000-0002-9043-1490
Ti Bai, https://orcid.org/0000-0002-6697-7434
Steve Jiang, https://orcid.org/0000-0002-3083-6752
Dan Nguyen, https://orcid.org/0000-0002-9590-0655
Affiliation (all authors): Medical Artificial Intelligence and Automation (MAIA) Laboratory and Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, United States of America
topic medical image segmentation
deep learning
unsupervised learning
clustering
hydrogel spacer