PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation

The application of LiDAR point cloud for urban environment analysis has become a critical approach in urban scene understanding. Concurrently, substantial progress has been made in 3D point cloud semantic segmentation, advancing the precision and effectiveness of urban scene interpretation. However,...

Bibliographic Details
Main Authors: Yang Luo, Ting Han, Xiaorong Zhang, Yujun Liu, Duxin Zhu, Jinyuan Li, Yiping Chen, Yundong Wu, Guorong Cai, Yingchao Piao, Jinhe Su
Format: Article
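Language: English
Published: Taylor & Francis Group 2025-08-01
Series: International Journal of Digital Earth
Subjects: Point cloud; semantic segmentation; urban scene; multimodal; positional guided; semantic alignment
Online Access: https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811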
author Yang Luo
Ting Han
Xiaorong Zhang
Yujun Liu
Duxin Zhu
Jinyuan Li
Yiping Chen
Yundong Wu
Guorong Cai
Yingchao Piao
Jinhe Su
collection DOAJ
description Applying LiDAR point clouds to urban environment analysis has become a critical approach in urban scene understanding. Concurrently, substantial progress has been made in 3D point cloud semantic segmentation, advancing the precision and effectiveness of urban scene interpretation. However, existing methods struggle with long-range LiDAR point clouds, where reduced point density and increased noise at greater distances cause segmentation errors and lower accuracy. To this end, we propose PASeg, which incorporates two key components: the Positional-Guided Classifier (PGC) and the Multimodal Semantic Alignment (MSA) module. The PGC uses positional embeddings to dynamically adjust normalization parameters, thereby improving segmentation accuracy across varying distances. The MSA module aligns semantic features from text, image, and point cloud data, facilitating better category differentiation. Together, PGC and MSA reinforce each other to strengthen large-scale 3D semantic segmentation. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that PASeg’s overall segmentation performance is competitive with state-of-the-art methods. Notably, on the SemanticKITTI dataset our method improves long-range LiDAR point cloud segmentation over the baseline segmenter by more than 2.3% in the 30–40 m range and more than 1.7% in the 40–50 m range. PASeg improves urban segmentation for smart, sustainable city development.
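The description says only that the PGC uses positional embeddings to adjust normalization parameters. As a rough illustration of that general idea, and not the authors' implementation, the sketch below normalizes one point's feature vector and then applies a scale and shift predicted from an embedding of the point's range. The sinusoidal embedding, the linear maps, and every name here (`sinusoidal_embedding`, `position_guided_norm`, `w_gamma`, `w_beta`) are assumptions made for the example.

```python
import math


def sinusoidal_embedding(distance_m, dim=8, max_period=100.0):
    """Encode a scalar range in metres as a sinusoidal vector (dim assumed even)."""
    half = dim // 2
    emb = []
    for i in range(half):
        freq = 1.0 / (max_period ** (i / half))
        emb.append(math.sin(distance_m * freq))
        emb.append(math.cos(distance_m * freq))
    return emb


def linear(x, weight, bias):
    """Plain matrix-vector product plus bias; weight holds one row per output."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def position_guided_norm(features, distance_m, w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Normalize one point's features, then scale and shift them by range.

    Hypothetical sketch: gamma and beta come from linear maps of the range
    embedding, so points at different distances get different parameters.
    """
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    normed = [(f - mean) / math.sqrt(var + eps) for f in features]

    emb = sinusoidal_embedding(distance_m, dim=len(w_gamma[0]))
    # gamma is predicted as an offset from 1 so zero weights give a plain norm
    gamma = [1.0 + g for g in linear(emb, w_gamma, b_gamma)]
    beta = linear(emb, w_beta, b_beta)
    return [g * n + b for g, n, b in zip(gamma, normed, beta)]


if __name__ == "__main__":
    feat = [1.0, 2.0, 3.0, 4.0]
    w = [[0.1] * 8 for _ in range(4)]   # illustrative weights, not learned values
    b = [0.0] * 4
    print(position_guided_norm(feat, 5.0, w, b, w, b))
    print(position_guided_norm(feat, 45.0, w, b, w, b))
```

With all weights set to zero the function reduces to an ordinary per-point normalization, which makes the range-dependent part easy to isolate. In a trained model the weights would be learned, and the actual PGC may condition on position in a different way.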
format Article
id doaj-art-daa17b15dfdf449da3fa1252be7efdea
institution Kabale University
issn 1753-8947
1753-8955
language English
publishDate 2025-08-01
publisher Taylor & Francis Group
record_format Article
series International Journal of Digital Earth
spelling doaj-art-daa17b15dfdf449da3fa1252be7efdea | 2025-08-25T11:24:58Z | eng | Taylor & Francis Group | International Journal of Digital Earth | 1753-8947 | 1753-8955 | 2025-08-01 | vol. 18, no. 1 | 10.1080/17538947.2025.2528811
Yang Luo (0): School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
Ting Han (1): School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, People’s Republic of China
Xiaorong Zhang (2): School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
Yujun Liu (3): School of Architecture and Urban Planning, Shenzhen University, Shenzhen, People’s Republic of China
Duxin Zhu (4): School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
Jinyuan Li (5): School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
Yiping Chen (6): School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, People’s Republic of China
Yundong Wu (7): School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
Guorong Cai (8): School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
Yingchao Piao (9): Computer Network Information Center, Chinese Academy of Sciences, Beijing, People’s Republic of China
Jinhe Su (10): School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
title PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation
topic Point cloud
semantic segmentation
urban scene
multimodal
positional guided
semantic alignment
url https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811