PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation
LiDAR-captured 3D point clouds are widely used in self-driving cars and smart cities. Point-based semantic segmentation methods make more efficient use of the rich geometric information contained in 3D point clouds, so they have gradually replaced other methods as the mainstream deep learning meth...
| Main Authors: | Hong Yi, Yaru Liu, Ming Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Remote Sensing |
| Subjects: | patch-based self-attention; 3D point cloud; semantic segmentation; receptive field |
| Online Access: | https://www.mdpi.com/2072-4292/17/12/2012 |
| _version_ | 1849426005509275648 |
|---|---|
| author | Hong Yi; Yaru Liu; Ming Wang |
| author_facet | Hong Yi; Yaru Liu; Ming Wang |
| author_sort | Hong Yi |
| collection | DOAJ |
| description | LiDAR-captured 3D point clouds are widely used in self-driving cars and smart cities. Point-based semantic segmentation methods make more efficient use of the rich geometric information contained in 3D point clouds, so they have gradually replaced other methods as the mainstream deep learning approach to 3D point cloud semantic segmentation. However, existing methods suffer from limited receptive fields and feature misalignment due to hierarchical downsampling. To address these challenges, we propose PSNet, a novel patch-based self-attention network that significantly expands the receptive field while ensuring feature alignment through a patch-aggregation paradigm. PSNet combines patch-based self-attention feature extraction with common point feature aggregation (CPFA) to implicitly model large-scale spatial relationships. The framework first divides the point cloud into overlapping patches and extracts local features via multi-head self-attention, then aggregates the features of points shared across patches to capture long-range context. Extensive experiments on the Toronto-3D and Complex Scene Point Cloud (CSPC) datasets validate PSNet’s state-of-the-art performance, with overall accuracies (OAs) of 98.4% and 97.2%, respectively, and significant improvements in challenging categories (e.g., +32.1% IoU for fences). Experimental results on the S3DIS dataset show that PSNet attains a competitive mIoU (71.2%) while maintaining lower inference latency (7.03 s). The PSNet architecture achieves larger receptive field coverage, a significant advantage over existing methods. This work not only reveals the mechanism by which patch-based self-attention enlarges the receptive field but also provides insights into attention-based 3D geometric learning and semantic segmentation architectures. Furthermore, it provides a useful reference for applications in autonomous vehicle navigation and smart city infrastructure management. |
| format | Article |
| id | doaj-art-261260dc7a6940c8bc5e9bf06f35eb77 |
| institution | Kabale University |
| issn | 2072-4292 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-261260dc7a6940c8bc5e9bf06f35eb77; 2025-08-20T03:29:35Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2025-06-01; vol. 17, no. 12, art. 2012; doi:10.3390/rs17122012; PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation; Hong Yi (College of Geographical Sciences, Harbin Normal University, Harbin 150025, China); Yaru Liu (Guangdong Urban-Rural Planning and Design Research Institute Technology Group Co., Ltd., Guangzhou 510290, China); Ming Wang (Inspur Cloud Information Technology Co., Ltd., Jinan 250101, China); https://www.mdpi.com/2072-4292/17/12/2012; patch-based self-attention; 3D point cloud; semantic segmentation; receptive field |
| spellingShingle | Hong Yi; Yaru Liu; Ming Wang; PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation; Remote Sensing; patch-based self-attention; 3D point cloud; semantic segmentation; receptive field |
| title | PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation |
| title_sort | psnet patch based self attention network for 3d point cloud semantic segmentation |
| topic | patch-based self-attention; 3D point cloud; semantic segmentation; receptive field |
| url | https://www.mdpi.com/2072-4292/17/12/2012 |
| work_keys_str_mv | AT hongyi psnetpatchbasedselfattentionnetworkfor3dpointcloudsemanticsegmentation AT yaruliu psnetpatchbasedselfattentionnetworkfor3dpointcloudsemanticsegmentation AT mingwang psnetpatchbasedselfattentionnetworkfor3dpointcloudsemanticsegmentation |
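
The description field outlines a three-step idea: split the cloud into overlapping patches, run self-attention inside each patch, then merge the features of points that several patches share. The sketch below illustrates only that idea and is not the authors' implementation. Everything in it is an assumption made for illustration: the function names are hypothetical, patches are sliding windows over point indices, attention is single-head with identity projections rather than learned multi-head attention, and shared points are merged by plain averaging, which may differ from the CPFA module in the paper.

```python
import math
from typing import List, Sequence

Vector = List[float]


def make_overlapping_patches(n_points: int, patch_size: int, stride: int) -> List[List[int]]:
    """Sliding windows over point indices; windows overlap when stride < patch_size."""
    if patch_size <= 0 or stride <= 0:
        raise ValueError("patch_size and stride must be positive")
    patches: List[List[int]] = []
    start = 0
    while True:
        end = min(start + patch_size, n_points)
        patches.append(list(range(start, end)))
        if end == n_points:
            break
        start += stride
    return patches


def self_attention(feats: Sequence[Vector]) -> List[Vector]:
    """Single-head scaled dot-product attention with identity Q/K/V (a simplification)."""
    if not feats:
        return []
    dim = len(feats[0])
    scale = math.sqrt(dim)
    out: List[Vector] = []
    for query in feats:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in feats]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * value[d] for w, value in zip(weights, feats)) / total
            for d in range(dim)
        ])
    return out


def aggregate_common_points(
    n_points: int,
    patches: Sequence[Sequence[int]],
    patch_feats: Sequence[Sequence[Vector]],
) -> List[Vector]:
    """Average the features each point receives from every patch that contains it."""
    dim = len(patch_feats[0][0])
    sums = [[0.0] * dim for _ in range(n_points)]
    counts = [0] * n_points
    for indices, feats in zip(patches, patch_feats):
        for i, feat in zip(indices, feats):
            counts[i] += 1
            for d in range(dim):
                sums[i][d] += feat[d]
    # Points covered by no patch keep a zero vector.
    return [[s / c for s in row] if c else row for row, c in zip(sums, counts)]


def patch_attention_features(point_feats: Sequence[Vector], patch_size: int, stride: int) -> List[Vector]:
    """Patch -> per-patch attention -> merge shared points."""
    patches = make_overlapping_patches(len(point_feats), patch_size, stride)
    patch_feats = [self_attention([point_feats[i] for i in idx]) for idx in patches]
    return aggregate_common_points(len(point_feats), patches, patch_feats)
```

With six points, `patch_size=4`, and `stride=2`, points 2 and 3 fall in both patches, so their merged features mix context from the whole cloud even though each attention call only saw four points. That is the intuition behind the receptive-field claim in the description.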