MSU-Net: A Synthesized U-Net for Exploiting Multi-Scale Features in OCT Image Segmentation


Bibliographic Details
Main Authors: Dejie Chen, Xiangping Chen, Hao Gu, Su Zhao, Hao Jiang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10949143/
Description
Summary: The U-Net architecture is widely recognized as a prominent algorithm for choroidal segmentation in optical coherence tomography (OCT) images. However, conventional U-Net implementations exhibit two critical limitations. First, the backbone employs uniform-sized convolutional kernels to process feature maps across all channels within the same layer, resulting in homogeneous receptive fields and a single-scale bottleneck that impedes global contextual feature extraction. Second, the skip connections are restricted to same-scale feature maps between encoder and decoder, failing to exploit cross-semantic hierarchical feature interactions. To address these issues, this study introduces MSU-Net, a novel neural network for OCT-based choroidal segmentation. The proposed framework enhances performance through two innovations: 1) replacement of standard encoder blocks with a multi-branch module combining heterogeneous convolutions to achieve multi-scale receptive field diversification; 2) redesign of skip connections through a pyramid fusion module with spatial attention for adaptive multi-level feature weighting. This architecture enables progressive refinement of low-level features guided by high-level semantics, significantly improving feature discriminability. Experimental results demonstrate superior performance with metrics of 99.5% (accuracy), 96.7% (sensitivity), 94.7% (Dice), and 94.6% (MIoU), surpassing the baseline by 0.4%, 3.7%, 2.8%, and 2.9%, respectively. Notably, the model shows consistent advantages in segmenting indistinct choroidal boundaries compared to state-of-the-art methods.
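The record does not give the paper's exact module designs, but the two ideas in the abstract (parallel heterogeneous convolutions for multi-scale receptive fields, and spatial-attention-weighted fusion) can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration only: fixed averaging kernels stand in for learned convolution weights, and a sigmoid of the branch-wise mean stands in for the paper's spatial attention map.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D convolution of a single-channel map x with kernel k."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multi_branch_block(x, kernel_sizes=(3, 5, 7)):
    """Run parallel branches with different kernel sizes and stack them
    channel-wise, giving each output channel a different receptive field.
    Kernel sizes are illustrative, not taken from the paper."""
    branches = []
    for ks in kernel_sizes:
        k = np.full((ks, ks), 1.0 / (ks * ks))  # placeholder for learned weights
        branches.append(conv2d_same(x, k))
    return np.stack(branches, axis=0)  # shape: (num_branches, H, W)

def spatial_attention_fuse(stacked):
    """Weight every spatial location by a sigmoid attention map derived from
    the branch-wise mean, then sum the branches into one fused map."""
    attn = 1.0 / (1.0 + np.exp(-stacked.mean(axis=0)))  # (H, W), values in (0, 1)
    return (stacked * attn).sum(axis=0)                 # (H, W)

feat = np.random.default_rng(0).random((16, 16))  # toy single-channel feature map
out = multi_branch_block(feat)
fused = spatial_attention_fuse(out)
print(out.shape, fused.shape)  # (3, 16, 16) (16, 16)
```

In the actual MSU-Net, such branches would carry learned weights and the fused multi-scale features would replace same-scale skip connections; this sketch only shows the data flow of combining heterogeneous receptive fields under a spatial weighting.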
ISSN:2169-3536