Medical image segmentation based on frequency domain decomposition SVD linear attention

Bibliographic Details
Main Authors: Liu Qiong, Li Chaofan, Teng Jinnan, Chen Liping, Song Jianxiang
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Subjects: Transformer; Frequency domain decomposition; SVD Decomposition; Image segmentation
Online Access: https://doi.org/10.1038/s41598-025-86315-1
author Liu Qiong
Li Chaofan
Teng Jinnan
Chen Liping
Song Jianxiang
collection DOAJ
description Abstract: Convolutional Neural Networks (CNNs) have achieved remarkable accuracy in medical image segmentation. However, the Vision Transformer (ViT), which can extract global information, captures contextual information better than CNNs, whose convolutional kernels have a limited receptive field. Despite this, ViT models struggle to fully detect and extract high-frequency signals, such as textures and boundaries, in medical images. These high-frequency features are essential in medical imaging, because targets such as tumors and pathological organs differ markedly in texture and boundary at different stages. In addition, the high resolution of medical images makes the self-attention mechanism of ViTs computationally expensive. To address these limitations, we propose a medical image segmentation network based on frequency domain decomposition using a Laplacian pyramid. The network selectively computes attention features for the high-frequency signals of the original image, which effectively enhances spatial structural information. During attention computation, we use Singular Value Decomposition (SVD) to extract an effective representation matrix from the original image and apply it as a linear projection in the attention computation. This reduces computational complexity while preserving essential features. We demonstrate the validity and superiority of our model on the Abdominal Multi-Organ Segmentation dataset and the Dermatological Disease dataset; on the Synapse dataset, our model achieved a Dice score of 82.68 and a Hausdorff distance (HD) of 17.23 mm. Experimental results indicate that our model segments effectively and with improved accuracy across datasets.
format Article
id doaj-art-a88ea1b9f47a4c17951c3acb5fe5d0b6
institution Kabale University
issn 2045-2322
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-a88ea1b9f47a4c17951c3acb5fe5d0b6
2025-01-26T12:32:36Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-01-01
volume 15, issue 1, pages 1-14
10.1038/s41598-025-86315-1
Medical image segmentation based on frequency domain decomposition SVD linear attention
Liu Qiong (School of Medical Imaging, Jiangsu Medical College)
Li Chaofan (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
Teng Jinnan (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
Chen Liping (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
Song Jianxiang (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
https://doi.org/10.1038/s41598-025-86315-1
Transformer
Frequency domain decomposition
SVD Decomposition
Image segmentation
title Medical image segmentation based on frequency domain decomposition SVD linear attention
topic Transformer
Frequency domain decomposition
SVD Decomposition
Image segmentation
url https://doi.org/10.1038/s41598-025-86315-1
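
Illustrative sketch (not part of the catalog record, and not the authors' implementation): the abstract names two ideas, a Laplacian pyramid that isolates high-frequency detail and an SVD-derived linear projection that shrinks the attention computation. The NumPy code below shows one way these could fit together under loud assumptions. The pyramid uses 2x2 average pooling and nearest-neighbour upsampling in place of the usual Gaussian filter, and the attention follows a Linformer-style scheme in which keys and values are projected along the token axis onto the top left singular vectors of the token matrix. All function names, the patch size, the rank, and the random weights are hypothetical.

```python
import numpy as np


def downsample(img):
    """Halve resolution with 2x2 average pooling (assumed stand-in for blur + subsample)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(img, levels):
    """Return (high-frequency residuals, finest first) and the low-frequency base."""
    highs, current = [], img
    for _ in range(levels):
        low = downsample(current)
        highs.append(current - upsample(low))  # detail lost by downsampling
        current = low
    return highs, current


def reconstruct(highs, base):
    """Invert laplacian_pyramid by adding residuals back, coarsest first."""
    current = base
    for high in reversed(highs):
        current = upsample(current) + high
    return current


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def svd_linear_attention(tokens, wq, wk, wv, rank):
    """Attention whose keys/values are compressed from n tokens to `rank` rows.

    Assumption: the projection is the transpose of the top-`rank` left
    singular vectors of the token matrix, so the score matrix is
    (n, rank) instead of (n, n).
    """
    u, _, _ = np.linalg.svd(tokens, full_matrices=False)  # u: (n, min(n, d))
    proj = u[:, :rank].T                                   # (rank, n)
    q = tokens @ wq                                        # (n, d)
    k = proj @ (tokens @ wk)                               # (rank, d)
    v = proj @ (tokens @ wv)                               # (rank, d)
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]))       # (n, rank)
    return scores @ v                                      # (n, d)


rng = np.random.default_rng(0)
image = rng.random((32, 32))            # synthetic stand-in for a medical image
highs, base = laplacian_pyramid(image, levels=2)

# Cut the finest high-frequency band into 4x4 patches: 64 tokens of dimension 16.
tokens = highs[0].reshape(8, 4, 8, 4).transpose(0, 2, 1, 3).reshape(64, 16)
d = tokens.shape[1]
wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
out = svd_linear_attention(tokens, wq, wk, wv, rank=8)
```

With n tokens of dimension d and a projection rank r, the score matrix in this sketch has n x r entries rather than n x n, which is where the saving over standard self-attention would come from when r is much smaller than n. The rank must not exceed min(n, d), and the image side lengths must be divisible by 2 raised to the number of pyramid levels.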