MFFTNet: A Novel 3D Point Cloud Segmentation Network Based on Multi-Scale Feature Fusion and Transformer Architecture

Main Authors: Hao Bai, Xiongwei Li, Qing Meng, Shulong Zhuo, Lili Yan
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Deep learning; multi-scale feature fusion; point cloud segmentation; transformer architecture
Online Access: https://ieeexplore.ieee.org/document/10836688/
author Hao Bai
Xiongwei Li
Qing Meng
Shulong Zhuo
Lili Yan
collection DOAJ
description Intelligent analysis of 3D point clouds has become a frontier in emerging fields such as autonomous driving, digital twins, and the metaverse. Precise segmentation of 3D point clouds is particularly important within these domains; however, it faces several challenges: (1) point cloud data inherently lacks structured topological information; (2) point cloud shapes are complex and highly variable, making it difficult to utilize semantic priors; and (3) the sampling process of point clouds may result in sparse and uneven data. To address these issues, this paper proposes a novel Point Cloud Segmentation Network based on multi-scale feature fusion and Transformer architecture (MFFTNet). MFFTNet enhances the performance of existing segmentation methods by globally modeling the overall point cloud shape and embedding local point cloud details. Specifically, MFFTNet divides the segmentation task into encoding and decoding stages. The encoder is designed as a hierarchical pyramid structure that extracts relatively sparse local center points and fuses local features during progressive downsampling. It also utilizes a Transformer for global feature modeling to establish multi-scale topological and semantic information of the point cloud. Subsequently, multi-scale feature fusion further enhances the network’s perception of local features and global structure. The decoder progressively upsamples to restore the original point cloud and injects multi-scale feature information to achieve precise segmentation. Based on the aforementioned encoding-decoding structure and multi-scale feature fusion, MFFTNet outperforms existing methods on the point cloud semantic segmentation datasets ShapeNetPart and S3DIS.
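As a rough illustration of the "progressive downsampling" step described in the abstract, the sketch below shows one pyramid stage in plain Python: pick sparse center points, group each center's neighbours, and pool their features. The abstract does not say how MFFTNet selects centers or fuses local features, so farthest point sampling, k-nearest-neighbour grouping, and max pooling are assumptions borrowed from common hierarchical point cloud networks, and every function name here is hypothetical.

```python
import math
import random


def farthest_point_sampling(points, num_centers, start=0):
    """Pick num_centers indices, each new one far from those already chosen."""
    if not 0 < num_centers <= len(points):
        raise ValueError("num_centers must be in 1..len(points)")
    chosen = [start]
    # min_dist[i] = distance from point i to its nearest chosen center so far
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < num_centers:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def knn_group(points, center_idx, k):
    """For each center, return the indices of its k nearest points (itself included)."""
    groups = []
    for c in center_idx:
        order = sorted(range(len(points)),
                       key=lambda i: math.dist(points[i], points[c]))
        groups.append(order[:k])
    return groups


def downsample_stage(points, feats, num_centers, k):
    """One assumed pyramid stage: sample centers, group neighbours, max-pool features."""
    centers = farthest_point_sampling(points, num_centers)
    groups = knn_group(points, centers, k)
    channels = len(feats[0])
    pooled = [[max(feats[i][ch] for i in g) for ch in range(channels)]
              for g in groups]
    return [points[c] for c in centers], pooled


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random(), random.random()) for _ in range(64)]
    fts = [list(p) for p in pts]  # toy features: the raw coordinates
    for n in (32, 16, 8):  # progressive downsampling: 64 -> 32 -> 16 -> 8 points
        pts, fts = downsample_stage(pts, fts, num_centers=n, k=4)
        print(len(pts), len(fts[0]))
```

In a network of the kind the abstract describes, the pooled features at each scale would presumably feed the Transformer and later be reinjected by the decoder; this sketch covers only the sampling and local pooling.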
format Article
id doaj-art-ec5c6c904c9f4b13915b1044c0a2151f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-ec5c6c904c9f4b13915b1044c0a2151f
2025-01-21T00:02:15Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2025-01-01, volume 13, pages 9462-9472
DOI 10.1109/ACCESS.2025.3528245
IEEE Xplore document 10836688
MFFTNet: A Novel 3D Point Cloud Segmentation Network Based on Multi-Scale Feature Fusion and Transformer Architecture
Hao Bai (0), https://orcid.org/0009-0005-7801-5071
Xiongwei Li (1)
Qing Meng (2)
Shulong Zhuo (3)
Lili Yan (4)
(0) College of Information Engineering, Hainan Vocational University of Science and Technology, Haikou, China
(1) School of Intelligent Manufacturing, Changzhou Vocational Institute of Engineering, Changzhou, China
(2) School of Computer and Artificial Intelligence, Hainan College of Software Technology, Qionghai, China
(3) College of Information Engineering, Hainan Vocational University of Science and Technology, Haikou, China
(4) School of Computer and Artificial Intelligence, Hainan College of Software Technology, Qionghai, China
title MFFTNet: A Novel 3D Point Cloud Segmentation Network Based on Multi-Scale Feature Fusion and Transformer Architecture
topic Deep learning
multi-scale feature fusion
point cloud segmentation
transformer architecture
url https://ieeexplore.ieee.org/document/10836688/