Medical image segmentation based on frequency domain decomposition SVD linear attention

Abstract Convolutional Neural Networks (CNNs) have achieved remarkable segmentation accuracy in medical image segmentation tasks. However, the Vision Transformer (ViT) model, with its capability of extracting global information, offers a significant advantage in contextual information compared to th...

Bibliographic Details
Main Authors:	Liu Qiong, Li Chaofan, Teng Jinnan, Chen Liping, Song Jianxiang
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Scientific Reports
Subjects:	Transformer; Frequency domain decomposition; SVD Decomposition; Image segmentation
Online Access:	https://doi.org/10.1038/s41598-025-86315-1

author	Liu Qiong, Li Chaofan, Teng Jinnan, Chen Liping, Song Jianxiang
collection	DOAJ
description	Convolutional Neural Networks (CNNs) have achieved remarkable accuracy in medical image segmentation. However, the Vision Transformer (ViT), which can extract global information, captures contextual information far better than CNNs, whose convolutional kernels have a limited receptive field. Despite this, ViT models struggle to fully detect and extract high-frequency signals, such as textures and boundaries, in medical images. These high-frequency features are essential in medical imaging, because targets such as tumors and pathological organs differ markedly in texture and boundary across disease stages. In addition, the high resolution of medical images makes the self-attention mechanism of ViTs computationally expensive. To address these limitations, we propose a medical image segmentation network based on frequency domain decomposition using a Laplacian pyramid. The approach selectively computes attention features for the high-frequency signals of the original image to enhance spatial structural information. During attention computation, we use Singular Value Decomposition (SVD) to extract an effective representation matrix from the original image, which is then applied as a linear projection in the attention computation. This reduces computational complexity while preserving essential features. We demonstrated the effectiveness and superiority of our model on the Abdominal Multi-Organ Segmentation dataset and the Dermatological Disease dataset; on the Synapse dataset the model achieved a Dice score of 82.68 and a Hausdorff distance (HD) of 17.23 mm. Experimental results indicate that the model segments effectively and with improved accuracy across datasets.
format	Article
id	doaj-art-a88ea1b9f47a4c17951c3acb5fe5d0b6
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-01-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-a88ea1b9f47a4c17951c3acb5fe5d0b6; 2025-01-26T12:32:36Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 15; 1; 1; 14; 10.1038/s41598-025-86315-1
	Medical image segmentation based on frequency domain decomposition SVD linear attention
	Liu Qiong (School of Medical Imaging, Jiangsu Medical College)
	Li Chaofan (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
	Teng Jinnan (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
	Chen Liping (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
	Song Jianxiang (Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital)
	https://doi.org/10.1038/s41598-025-86315-1
	Transformer; Frequency domain decomposition; SVD Decomposition; Image segmentation
title	Medical image segmentation based on frequency domain decomposition SVD linear attention
topic	Transformer; Frequency domain decomposition; SVD Decomposition; Image segmentation
url	https://doi.org/10.1038/s41598-025-86315-1
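
The two ideas named in the abstract, Laplacian-pyramid frequency decomposition and an SVD-derived linear projection inside attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give the architecture, so every function name here (`laplacian_pyramid`, `svd_projected_attention`, and the helpers) is hypothetical, a 2x2 box filter stands in for the usual Gaussian kernel, and the attention variant (compressing keys and values with the top left singular vectors of the token matrix) is only one plausible reading of "SVD linear attention".

```python
import numpy as np

def blur_downsample(img):
    """Average each 2x2 block and halve the resolution (box filter, an assumption)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Nearest-neighbour upsampling by a factor of two."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Split img into high-frequency bands plus a low-frequency residual.

    Both image sides must be divisible by 2**levels.
    """
    bands, current = [], img
    for _ in range(levels):
        low = blur_downsample(current)
        bands.append(current - upsample(low))  # detail lost by downsampling
        current = low
    return bands, current

def reconstruct(bands, residual):
    """Invert laplacian_pyramid by adding the bands back, coarse to fine."""
    current = residual
    for band in reversed(bands):
        current = upsample(current) + band
    return current

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def svd_projected_attention(tokens, rank):
    """Self-attention whose keys and values are compressed to `rank` rows.

    The projection is built from the top left singular vectors of `tokens`,
    so the score matrix is (n, rank) instead of (n, n). Requires
    rank <= min(tokens.shape). Learned query/key/value weights are omitted.
    """
    n, d = tokens.shape
    u, _, _ = np.linalg.svd(tokens, full_matrices=False)
    proj = u[:, :rank].T                 # (rank, n)
    keys = proj @ tokens                 # (rank, d)
    values = proj @ tokens               # (rank, d)
    scores = softmax(tokens @ keys.T / np.sqrt(d))  # (n, rank)
    return scores @ values               # (n, d)

# Usage: attend only over the finest high-frequency band, rows as tokens.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
bands, residual = laplacian_pyramid(image, levels=2)
attended = svd_projected_attention(bands[0], rank=4)
```

With `rank` fixed, the score matrix grows linearly with the number of tokens rather than quadratically, which is the kind of saving the abstract attributes to the SVD projection.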


Similar Items