PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation

LiDAR-captured 3D point clouds are widely used in self-driving cars and smart cities. Point-based semantic segmentation methods make more efficient use of the rich geometric information contained in 3D point clouds, so they have gradually replaced other approaches as the mainstream deep learning meth...

Bibliographic Details
Main Authors:	Hong Yi, Yaru Liu, Ming Wang
Format:	Article
Language:	English
Published:	MDPI AG 2025-06-01
Series:	Remote Sensing
Subjects:	patch-based self-attention; 3D point cloud; semantic segmentation; receptive field
Online Access:	https://www.mdpi.com/2072-4292/17/12/2012

collection	DOAJ
description	LiDAR-captured 3D point clouds are widely used in self-driving cars and smart cities. Point-based semantic segmentation methods make more efficient use of the rich geometric information contained in 3D point clouds, so they have gradually replaced other approaches as the mainstream deep learning method for 3D point cloud semantic segmentation. However, existing methods suffer from limited receptive fields and feature misalignment due to hierarchical downsampling. To address these challenges, we propose PSNet, a novel patch-based self-attention network that significantly expands the receptive field while ensuring feature alignment through a patch-aggregation paradigm. PSNet combines patch-based self-attention feature extraction with common point feature aggregation (CPFA) to implicitly model large-scale spatial relationships. The framework first divides the point cloud into overlapping patches and extracts local features via multi-head self-attention, then aggregates the features of points shared across patches to capture long-range context. Extensive experiments on the Toronto-3D and Complex Scene Point Cloud (CSPC) datasets validate PSNet’s state-of-the-art performance, with overall accuracies (OAs) of 98.4% and 97.2%, respectively, and significant improvements in challenging categories (e.g., +32.1% IoU for fences). On the S3DIS dataset, PSNet attains a competitive mIoU (71.2%) while maintaining lower inference latency (7.03 s). The PSNet architecture achieves larger receptive field coverage, a significant advantage over existing methods. This work not only reveals the mechanism by which patch-based self-attention enhances the receptive field but also provides insights into attention-based 3D geometric learning and semantic segmentation architectures. Furthermore, it offers a useful reference for applications in autonomous vehicle navigation and smart city infrastructure management.
id	doaj-art-261260dc7a6940c8bc5e9bf06f35eb77
institution	Kabale University
issn	2072-4292
spelling	doaj-art-261260dc7a6940c8bc5e9bf06f35eb77; 2025-08-20T03:29:35Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2025-06-01; 17; 12; 2012; 10.3390/rs17122012; PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation; Hong Yi [0]; Yaru Liu [1]; Ming Wang [2]; College of Geographical Sciences, Harbin Normal University, Harbin 150025, China; Guangdong Urban-Rural Planning and Design Research Institute Technology Group Co., Ltd., Guangzhou 510290, China; Inspur Cloud Information Technology Co., Ltd., Jinan 250101, China; https://www.mdpi.com/2072-4292/17/12/2012; patch-based self-attention; 3D point cloud; semantic segmentation; receptive field
title	PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation

Similar Items