DE-Net: A Dual-Encoder Network for Local and Long-Distance Context Information Extraction in Semantic Segmentation of Large-Scale Scene Point Clouds

Semantic segmentation of large-scale point clouds is essential for applications such as autonomous driving and high-definition mapping. However, this task remains challenging due to the imbalanced distribution of categories in large-scale point cloud data and the similarity in local geometric struct...

Bibliographic Details
Main Authors: Zhipeng He, Jing Liu, Shuai Yang
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Deep learning; dual-encoder; semantic segmentation; 3-D point cloud
Online Access: https://ieeexplore.ieee.org/document/10652235/
author Zhipeng He
Jing Liu
Shuai Yang
collection DOAJ
description Semantic segmentation of large-scale point clouds is essential for applications such as autonomous driving and high-definition mapping. However, this task remains challenging due to the imbalanced distribution of categories in large-scale point cloud data and the similarity in local geometric structures. Most current deep learning–based methods concentrate on designing local feature extraction modules while neglecting the significance of long-distance contextual information. Nevertheless, this contextual information is crucial for accurate object segmentation in large-scale scenes. To address this limitation, we propose a dual-encoder segmentation network called DE-Net. DE-Net effectively learns both the local and long-distance contextual information for each point to achieve accurate point segmentation. DE-Net consists of two main components: dual-encoder modules (DEMs) and gradient-aware pooling modules (GAPMs). DEMs extract local geometry and long-distance contextual information for each point using positional and trigonometric encoding to distinguish complex geometric features. GAPMs aggregate global information effectively using dual-distance and xy gradient information. In addition, a prediction jitter module was introduced during training to address the issue of class imbalance and improve the network's prediction results. The experimental results on three public benchmarks demonstrate that DE-Net outperforms existing state-of-the-art methods, achieving mean intersection over union scores of 83.5%, 61.8%, and 63.9% on Toronto-3D, WHU-MLS, and S3DIS datasets, respectively.
format Article
id doaj-art-eab0f0f0405f4af8b64491655a3fba4e
institution DOAJ
issn 1939-1404
2151-1535
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-eab0f0f0405f4af8b64491655a3fba4e
2025-08-20T02:55:56Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
1939-1404
2151-1535
2024-01-01
Vol. 17, pp. 15914-15926
10.1109/JSTARS.2024.3450708
10652235
DE-Net: A Dual-Encoder Network for Local and Long-Distance Context Information Extraction in Semantic Segmentation of Large-Scale Scene Point Clouds
Zhipeng He (https://orcid.org/0009-0008-5465-3468), Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, China
Jing Liu (https://orcid.org/0000-0001-5207-7614), Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, China
Shuai Yang, 31682 Troop of People's Liberation Army, Lanzhou, China
https://ieeexplore.ieee.org/document/10652235/
title DE-Net: A Dual-Encoder Network for Local and Long-Distance Context Information Extraction in Semantic Segmentation of Large-Scale Scene Point Clouds
topic Deep learning
dual-encoder
semantic segmentation
3-D point cloud
url https://ieeexplore.ieee.org/document/10652235/