Medical image segmentation based on frequency domain decomposition SVD linear attention
Abstract Convolutional Neural Networks (CNNs) have achieved remarkable segmentation accuracy in medical image segmentation tasks. However, the Vision Transformer (ViT) model, with its capability of extracting global information, offers a significant advantage in contextual information compared to th...
Main Authors: | Liu Qiong, Li Chaofan, Teng Jinnan, Chen Liping, Song Jianxiang |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Reports |
Subjects: | Transformer; Frequency domain decomposition; SVD Decomposition; Image segmentation |
Online Access: | https://doi.org/10.1038/s41598-025-86315-1 |
_version_ | 1832585730843475968 |
---|---|
author | Liu Qiong Li Chaofan Teng Jinnan Chen Liping Song Jianxiang |
collection | DOAJ |
description | Abstract: Convolutional Neural Networks (CNNs) have achieved remarkable accuracy in medical image segmentation. However, the Vision Transformer (ViT), which can extract global information, captures context better than CNNs, whose convolutional kernels have a limited receptive field. Despite this, ViT models struggle to fully detect and extract high-frequency signals, such as textures and boundaries, in medical images. These high-frequency features are essential in medical imaging, because targets such as tumors and pathological organs differ significantly in texture and boundaries across stages. In addition, the high resolution of medical images makes the self-attention mechanism of ViTs computationally expensive. To address these limitations, we propose a medical image segmentation network based on frequency domain decomposition using a Laplacian pyramid. This approach selectively computes attention features for the high-frequency signals of the original image to enhance spatial structural information. During attention computation, we use Singular Value Decomposition (SVD) to extract an effective representation matrix from the original image, which then serves as the linear projection in the attention computation. This reduces computational complexity while preserving essential features. We demonstrated the validity and superiority of our model on the Abdominal Multi-Organ Segmentation dataset and the Dermatological Disease dataset; on the Synapse dataset, our model achieved a Dice score of 82.68 and an HD of 17.23 mm. Experimental results indicate that our model segments effectively and improves accuracy across various datasets. |
format | Article |
id | doaj-art-a88ea1b9f47a4c17951c3acb5fe5d0b6 |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-a88ea1b9f47a4c17951c3acb5fe5d0b6; 2025-01-26T12:32:36Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 151114; 10.1038/s41598-025-86315-1; Medical image segmentation based on frequency domain decomposition SVD linear attention; Liu Qiong (0), Li Chaofan (1), Teng Jinnan (2), Chen Liping (3), Song Jianxiang (4); affiliations, in record order: School of Medical Imaging, Jiangsu Medical College; Affiliated Hospital 6 of Nantong University, Yancheng Third People’s Hospital (listed four times); https://doi.org/10.1038/s41598-025-86315-1; Transformer; Frequency domain decomposition; SVD Decomposition; Image segmentation |
title | Medical image segmentation based on frequency domain decomposition SVD linear attention |
topic | Transformer Frequency domain decomposition SVD Decomposition Image segmentation |
url | https://doi.org/10.1038/s41598-025-86315-1 |
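
The abstract in this record describes two steps: splitting an image into frequency bands with a Laplacian pyramid, and using SVD to build a linear projection that shortens the attention computation. The following is a minimal NumPy sketch of that idea, not the authors' implementation. It rests on several assumptions that do not come from the record: the pyramid uses 2x2 average pooling and nearest-neighbour upsampling instead of Gaussian filtering, the projection follows a Linformer-style reduction of keys and values built from the top left singular vectors of the token matrix, the query/key/value weights are random, and the function names `laplacian_split`, `to_tokens`, and `svd_linear_attention` are hypothetical.

```python
import numpy as np


def laplacian_split(img):
    """One simplified pyramid level: return (high, low) so that
    img == high + upsample(low). Assumes even height and width."""
    h, w = img.shape
    # Low-frequency band: 2x2 average pooling (stand-in for blur + downsample).
    low = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # Nearest-neighbour upsampling back to the input size.
    up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
    # High-frequency band: the detail lost by the low-pass step.
    high = img - up
    return high, low


def to_tokens(x, patch):
    """Cut a 2-D array into non-overlapping patch x patch tokens, one per row."""
    h, w = x.shape
    x = x.reshape(h // patch, patch, w // patch, patch).transpose(0, 2, 1, 3)
    return x.reshape(-1, patch * patch)


def svd_linear_attention(tokens, rank, seed=0):
    """Attention with keys and values reduced from n rows to `rank` rows.
    The reduction matrix comes from the SVD of the tokens (an assumption)."""
    n, d = tokens.shape
    rng = np.random.default_rng(seed)
    # Random weights stand in for learned Q/K/V projections.
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    # Top `rank` left singular vectors span the dominant token directions.
    u, _, _ = np.linalg.svd(tokens, full_matrices=False)
    proj = u[:, :rank].T                      # shape (rank, n)
    k_red, v_red = proj @ k, proj @ v         # shape (rank, d) each
    scores = q @ k_red.T / np.sqrt(d)         # shape (n, rank), not (n, n)
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v_red                    # shape (n, d)


# Usage on a random 32x32 "image": attend only over the high-frequency band.
image = np.random.default_rng(1).random((32, 32))
high, low = laplacian_split(image)
tokens = to_tokens(high, patch=4)             # 64 tokens of dimension 16
out = svd_linear_attention(tokens, rank=8)
```

With n tokens of dimension d, the score matrix here has shape n by rank instead of n by n, which is where the reduction in cost comes from in this sketch; `rank` must not exceed min(n, d).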