DE-Net: A Dual-Encoder Network for Local and Long-Distance Context Information Extraction in Semantic Segmentation of Large-Scale Scene Point Clouds

Bibliographic Details
Main Authors:	Zhipeng He, Jing Liu, Shuai Yang
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects:	Deep learning; dual-encoder; semantic segmentation; 3-D point cloud
Online Access:	https://ieeexplore.ieee.org/document/10652235/

Description
Summary:	Semantic segmentation of large-scale point clouds is essential for applications such as autonomous driving and high-definition mapping. However, this task remains challenging due to the imbalanced distribution of categories in large-scale point cloud data and the similarity in local geometric structures. Most current deep learning–based methods concentrate on designing local feature extraction modules while neglecting the significance of long-distance contextual information. Nevertheless, this contextual information is crucial for accurate object segmentation in large-scale scenes. To address this limitation, we propose a dual-encoder segmentation network called DE-Net. DE-Net effectively learns both the local and long-distance contextual information for each point to achieve accurate point segmentation. DE-Net consists of two main components: dual-encoder modules (DEMs) and gradient-aware pooling modules (GAPMs). DEMs extract local geometry and long-distance contextual information for each point using positional and trigonometric encoding to distinguish complex geometric features. GAPMs aggregate global information effectively using dual-distance and xy gradient information. In addition, a prediction jitter module is introduced during training to address the issue of class imbalance and improve the network's prediction results. The experimental results on three public benchmarks demonstrate that DE-Net outperforms existing state-of-the-art methods, achieving mean intersection over union scores of 83.5%, 61.8%, and 63.9% on the Toronto-3D, WHU-MLS, and S3DIS datasets, respectively.
ISSN:	1939-1404 2151-1535

Similar Items