Multi-Scale Kolmogorov-Arnold Network (KAN)-Based Linear Attention Network: Multi-Scale Feature Fusion with KAN and Deformable Convolution for Urban Scene Image Semantic Segmentation

Introducing an attention mechanism into remote sensing image segmentation improves segmentation accuracy. This paper develops MKLANet, a novel multi-scale KAN-based linear attention (MKLA) segmentation network, to achieve better segmentation results. A hybrid global–local...

Bibliographic Details
Main Authors: Yuanhang Li, Shuo Liu, Jie Wu, Weichao Sun, Qingke Wen, Yibiao Wu, Xiujuan Qin, Yanyou Qiao
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Remote Sensing
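Subjects: semantic segmentation; remote sensing image segmentation; attention mechanism; deformable convolution; multi-scale feature fusion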
Online Access: https://www.mdpi.com/2072-4292/17/5/802
author Yuanhang Li
Shuo Liu
Jie Wu
Weichao Sun
Qingke Wen
Yibiao Wu
Xiujuan Qin
Yanyou Qiao
author_sort Yuanhang Li
collection DOAJ
description Introducing an attention mechanism into remote sensing image segmentation improves segmentation accuracy. This paper develops MKLANet, a novel multi-scale KAN-based linear attention (MKLA) segmentation network, to achieve better segmentation results. A hybrid global–local attention mechanism in the feature decoder is designed to improve the aggregation of global–local context and to avoid potential blocking artifacts in feature extraction and segmentation. The local attention channel adopts the MKLA block, which brings the merits of KAN convolution into a Mamba-like linear attention block to improve the handling of linear and nonlinear features and the approximation of complex functions at the cost of a few extra computations. The global attention channel uses a long-range cascade encoder–decoder block, which mainly employs a 7 × 7 depth-wise convolution token mixer and a lightweight 7 × 7 dilated deep convolution to capture long-distance spatial features and retain key spatial information. In addition, to enrich the input of the attention block, a deformable convolution module is placed between each encoder output and the decoder at the corresponding scale, which improves the expressive ability of the segmentation model without increasing the depth of the network. Experimental results on the Vaihingen dataset (83.68% mIoU, 92.98% OA, and 91.08% mF1), the UAVid dataset (69.78% mIoU, 96.51% OA), the LoveDA dataset (51.53% mIoU, 86.42% OA, and 67.19% mF1), and the Potsdam dataset (97.14% mIoU, 92.64% OA, and 93.8% mF1) show that MKLANet outperforms other advanced attention-based approaches, particularly in segmenting small targets and edges.
format Article
id doaj-art-197e36bd9b7d4b038a8dcc60fce1745a
institution DOAJ
issn 2072-4292
language English
publishDate 2025-02-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-197e36bd9b7d4b038a8dcc60fce1745a
timestamp 2025-08-20T02:52:35Z
language eng
publisher MDPI AG
journal Remote Sensing
issn 2072-4292
date 2025-02-01
volume 17, issue 5, article 802
doi 10.3390/rs17050802
title Multi-Scale Kolmogorov-Arnold Network (KAN)-Based Linear Attention Network: Multi-Scale Feature Fusion with KAN and Deformable Convolution for Urban Scene Image Semantic Segmentation
affiliation Yuanhang Li, Shuo Liu, Jie Wu, Weichao Sun, Qingke Wen, Yanyou Qiao: National Engineering Research Center for Geomatics (NCG), Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
affiliation Yibiao Wu, Xiujuan Qin: Beijing Institute of Control and Electronic Technology, Beijing 100038, China
url https://www.mdpi.com/2072-4292/17/5/802
keywords semantic segmentation; remote sensing image segmentation; attention mechanism; deformable convolution; multi-scale feature fusion
title Multi-Scale Kolmogorov-Arnold Network (KAN)-Based Linear Attention Network: Multi-Scale Feature Fusion with KAN and Deformable Convolution for Urban Scene Image Semantic Segmentation
topic semantic segmentation
remote sensing image segmentation
attention mechanism
deformable convolution
multi-scale feature fusion
url https://www.mdpi.com/2072-4292/17/5/802