MFFTNet: A Novel 3D Point Cloud Segmentation Network Based on Multi-Scale Feature Fusion and Transformer Architecture
Intelligent analysis of 3D point clouds has become a frontier in emerging fields such as autonomous driving, digital twins, and the metaverse. Precise segmentation of 3D point clouds is particularly important within these domains; however, it faces several challenges: ...
Main Authors: | Hao Bai, Xiongwei Li, Qing Meng, Shulong Zhuo, Lili Yan |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | Deep learning; multi-scale feature fusion; point cloud segmentation; transformer architecture |
Online Access: | https://ieeexplore.ieee.org/document/10836688/ |
_version_ | 1832592898416181248 |
---|---|
author | Hao Bai; Xiongwei Li; Qing Meng; Shulong Zhuo; Lili Yan |
author_facet | Hao Bai; Xiongwei Li; Qing Meng; Shulong Zhuo; Lili Yan |
author_sort | Hao Bai |
collection | DOAJ |
description | Intelligent analysis of 3D point clouds has become a frontier in emerging fields such as autonomous driving, digital twins, and the metaverse. Precise segmentation of 3D point clouds is particularly important within these domains; however, it faces several challenges: (1) point cloud data inherently lacks structured topological information; (2) point cloud shapes are complex and highly variable, making it difficult to utilize semantic priors; and (3) the sampling process of point clouds may result in sparse and uneven data. To address these issues, this paper proposes a novel Point Cloud Segmentation Network based on multi-scale feature fusion and Transformer architecture (MFFTNet). MFFTNet enhances the performance of existing segmentation methods by globally modeling the overall point cloud shape and embedding local point cloud details. Specifically, MFFTNet divides the segmentation task into encoding and decoding stages. The encoder is designed as a hierarchical pyramid structure that extracts relatively sparse local center points and fuses local features during progressive downsampling. It also utilizes a Transformer for global feature modeling to establish multi-scale topological and semantic information of the point cloud. Subsequently, multi-scale feature fusion further enhances the network’s perception of local features and global structure. The decoder progressively upsamples to restore the original point cloud and injects multi-scale feature information to achieve precise segmentation. Based on the aforementioned encoding-decoding structure and multi-scale feature fusion, MFFTNet outperforms existing methods on the point cloud semantic segmentation datasets ShapeNetPart and S3DIS. |
format | Article |
id | doaj-art-ec5c6c904c9f4b13915b1044c0a2151f |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-ec5c6c904c9f4b13915b1044c0a2151f; 2025-01-21T00:02:15Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 9462; 9472; 10.1109/ACCESS.2025.3528245; 10836688; MFFTNet: A Novel 3D Point Cloud Segmentation Network Based on Multi-Scale Feature Fusion and Transformer Architecture; Hao Bai (0, https://orcid.org/0009-0005-7801-5071); Xiongwei Li (1); Qing Meng (2); Shulong Zhuo (3); Lili Yan (4); (0) College of Information Engineering, Hainan Vocational University of Science and Technology, Haikou, China; (1) School of Intelligent Manufacturing, Changzhou Vocational Institute of Engineering, Changzhou, China; (2) School of Computer and Artificial Intelligence, Hainan College of Software Technology, Qionghai, China; (3) College of Information Engineering, Hainan Vocational University of Science and Technology, Haikou, China; (4) School of Computer and Artificial Intelligence, Hainan College of Software Technology, Qionghai, China; Intelligent analysis of 3D point clouds has become a frontier in emerging fields such as autonomous driving, digital twins, and the metaverse. Precise segmentation of 3D point clouds is particularly important within these domains; however, it faces several challenges: (1) point cloud data inherently lacks structured topological information; (2) point cloud shapes are complex and highly variable, making it difficult to utilize semantic priors; and (3) the sampling process of point clouds may result in sparse and uneven data. To address these issues, this paper proposes a novel Point Cloud Segmentation Network based on multi-scale feature fusion and Transformer architecture (MFFTNet). MFFTNet enhances the performance of existing segmentation methods by globally modeling the overall point cloud shape and embedding local point cloud details. Specifically, MFFTNet divides the segmentation task into encoding and decoding stages. The encoder is designed as a hierarchical pyramid structure that extracts relatively sparse local center points and fuses local features during progressive downsampling. It also utilizes a Transformer for global feature modeling to establish multi-scale topological and semantic information of the point cloud. Subsequently, multi-scale feature fusion further enhances the network’s perception of local features and global structure. The decoder progressively upsamples to restore the original point cloud and injects multi-scale feature information to achieve precise segmentation. Based on the aforementioned encoding-decoding structure and multi-scale feature fusion, MFFTNet outperforms existing methods on the point cloud semantic segmentation datasets ShapeNetPart and S3DIS.; https://ieeexplore.ieee.org/document/10836688/; Deep learning; multi-scale feature fusion; point cloud segmentation; transformer architecture |
spellingShingle | Hao Bai; Xiongwei Li; Qing Meng; Shulong Zhuo; Lili Yan; MFFTNet: A Novel 3D Point Cloud Segmentation Network Based on Multi-Scale Feature Fusion and Transformer Architecture; IEEE Access; Deep learning; multi-scale feature fusion; point cloud segmentation; transformer architecture |
title | MFFTNet: A Novel 3D Point Cloud Segmentation Network Based on Multi-Scale Feature Fusion and Transformer Architecture |
topic | Deep learning; multi-scale feature fusion; point cloud segmentation; transformer architecture |
url | https://ieeexplore.ieee.org/document/10836688/ |
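
The description above outlines an encoder that progressively downsamples to sparse local center points while fusing local features, and a decoder that progressively upsamples and injects multi-scale features. The following is a minimal sketch of that data flow only, not the authors' implementation. The record does not name the sampling, grouping, pooling, or interpolation methods, so farthest point sampling, k-nearest-neighbor grouping, max pooling, and inverse-distance interpolation are assumptions borrowed from common point cloud pipelines. All function names are hypothetical, and the Transformer stage is omitted.

```python
import math
import random


def farthest_point_sampling(points, m):
    """Pick m well-spread center indices (assumed sampler, not from the record)."""
    chosen = [0]  # start from the first point so the result is deterministic
    dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        # keep, for every point, its distance to the nearest chosen center
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen


def knn(points, query, k):
    """Indices of the k points nearest to query."""
    return sorted(range(len(points)), key=lambda i: math.dist(points[i], query))[:k]


def downsample(points, feats, m, k):
    """One encoder stage: sample m centers, max-pool features over each k-neighborhood."""
    centers = farthest_point_sampling(points, m)
    new_points = [points[c] for c in centers]
    dim = len(feats[0])
    new_feats = []
    for c in centers:
        group = knn(points, points[c], k)
        new_feats.append([max(feats[i][d] for i in group) for d in range(dim)])
    return new_points, new_feats


def upsample(coarse_pts, coarse_feats, fine_pts, fine_feats, k=3):
    """One decoder stage: inverse-distance interpolation plus skip concatenation."""
    dim = len(coarse_feats[0])
    out = []
    for p, skip in zip(fine_pts, fine_feats):
        idx = knn(coarse_pts, p, k)
        w = [1.0 / (math.dist(coarse_pts[i], p) + 1e-8) for i in idx]
        total = sum(w)
        interp = [sum(wi * coarse_feats[i][d] for wi, i in zip(w, idx)) / total
                  for d in range(dim)]
        out.append(list(skip) + interp)  # inject the finer-scale features
    return out


# Toy run: 64 random points whose initial features are their coordinates.
random.seed(0)
pts = [(random.random(), random.random(), random.random()) for _ in range(64)]
feats = [list(p) for p in pts]
p1, f1 = downsample(pts, feats, m=16, k=8)   # 64 -> 16 centers
p2, f2 = downsample(p1, f1, m=4, k=4)        # 16 -> 4 centers
u1 = upsample(p2, f2, p1, f1)                # 4 -> 16, features 3 + 3
u0 = upsample(p1, u1, pts, feats)            # 16 -> 64, features 3 + 6
print(len(p1), len(p2), len(u0), len(u0[0]))  # prints: 16 4 64 9
```

In this sketch the per-point feature width grows at each decoder stage because the skip features are concatenated rather than projected; a real network would typically follow each concatenation with a learned layer.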