PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation

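The application of LiDAR point cloud for urban environment analysis has become a critical approach in urban scene understanding. Concurrently, substantial progress has been made in 3D point cloud semantic segmentation, advancing the precision and effectiveness of urban scene interpretation. However, existing methods face challenges when handling long-range LiDAR point cloud, where reduced point density and increased noise at greater distances result in segmentation errors and diminished accuracy. To this end, we propose PASeg, which incorporates two key components: the Positional-Guided Classifier (PGC) and the Multimodal Semantic Alignment (MSA) module. The PGC uses positional embeddings to dynamically adjust normalization parameters, thereby improving segmentation accuracy across varying distances. The MSA module aligns semantic features from text, image, and point cloud data, facilitating better category differentiation. The interaction between PGC and MSA strengthens large-scale 3D semantic segmentation synergistically. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that PASeg’s overall segmentation performance is competitive with state-of-the-art methods. Notably, our method achieves a significant improvement of over 2.3% and 1.7% in long-range LiDAR point cloud segmentation (30–40 m and 40–50 m, respectively) compared to the baseline segmenter on the SemanticKITTI dataset. PASeg improves urban segmentation for smart, sustainable city development.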
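The abstract says the PGC uses positional embeddings to adjust normalization parameters. The record gives no architectural detail, so the following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a sinusoidal embedding of a point's range from the sensor and a layer normalization whose scale and shift are predicted from that embedding. All names (`positional_embedding`, `position_conditioned_norm`, the weight matrices) are hypothetical, and the code uses only the Python standard library to stay self-contained.

```python
import math


def positional_embedding(distance_m, dim=8, max_wavelength=100.0):
    """Sinusoidal embedding of a point's range in metres (assumed form, dim must be even)."""
    emb = []
    for i in range(dim // 2):
        freq = 1.0 / (max_wavelength ** (2 * i / dim))
        emb.append(math.sin(distance_m * freq))
        emb.append(math.cos(distance_m * freq))
    return emb


def linear(weights, bias, x):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def position_conditioned_norm(feature, distance_m, w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Layer-normalize `feature`, then scale and shift it with parameters
    predicted from the positional embedding (hypothetical design)."""
    mean = sum(feature) / len(feature)
    var = sum((f - mean) ** 2 for f in feature) / len(feature)
    normed = [(f - mean) / math.sqrt(var + eps) for f in feature]

    emb = positional_embedding(distance_m, dim=len(w_gamma[0]))
    gamma = linear(w_gamma, b_gamma, emb)  # per-channel scale offset
    beta = linear(w_beta, b_beta, emb)     # per-channel shift
    # With all-zero weights this reduces to plain layer normalization.
    return [(1.0 + g) * n + b for g, n, b in zip(gamma, normed, beta)]


# Toy usage: the same feature vector is normalized differently at 5 m and 45 m.
feature = [1.0, 2.0, 3.0, 4.0]
zeros_w = [[0.0] * 8 for _ in range(4)]
zeros_b = [0.0] * 4
small_w = [[0.1] * 8 for _ in range(4)]

near = position_conditioned_norm(feature, 5.0, small_w, zeros_b, zeros_w, zeros_b)
far = position_conditioned_norm(feature, 45.0, small_w, zeros_b, zeros_w, zeros_b)
```

In a trained model the weight matrices would be learned. Here they are fixed toy values chosen only to show that the output depends on distance.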

Bibliographic Details
Main Authors:	Yang Luo, Ting Han, Xiaorong Zhang, Yujun Liu, Duxin Zhu, Jinyuan Li, Yiping Chen, Yundong Wu, Guorong Cai, Yingchao Piao, Jinhe Su
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2025-08-01
Series:	International Journal of Digital Earth
Subjects:	Point cloud; semantic segmentation; urban scene; multimodal; positional guided; semantic alignment
Online Access:	https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811

author	Yang Luo, Ting Han, Xiaorong Zhang, Yujun Liu, Duxin Zhu, Jinyuan Li, Yiping Chen, Yundong Wu, Guorong Cai, Yingchao Piao, Jinhe Su
collection	DOAJ
description	The application of LiDAR point cloud for urban environment analysis has become a critical approach in urban scene understanding. Concurrently, substantial progress has been made in 3D point cloud semantic segmentation, advancing the precision and effectiveness of urban scene interpretation. However, existing methods face challenges when handling long-range LiDAR point cloud, where reduced point density and increased noise at greater distances result in segmentation errors and diminished accuracy. To this end, we propose PASeg, which incorporates two key components: the Positional-Guided Classifier (PGC) and the Multimodal Semantic Alignment (MSA) module. The PGC uses positional embeddings to dynamically adjust normalization parameters, thereby improving segmentation accuracy across varying distances. The MSA module aligns semantic features from text, image, and point cloud data, facilitating better category differentiation. The interaction between PGC and MSA strengthens large-scale 3D semantic segmentation synergistically. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that PASeg’s overall segmentation performance is competitive with state-of-the-art methods. Notably, our method achieves a significant improvement of over 2.3% and 1.7% in long-range LiDAR point cloud segmentation (30–40 m and 40–50 m, respectively) compared to the baseline segmenter on the SemanticKITTI dataset. PASeg improves urban segmentation for smart, sustainable city development.
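The description reports gains for specific distance ranges (30–40 m and 40–50 m). As a rough illustration of how a per-range score of this kind can be computed, the sketch below groups points into 10 m bins and reports mean IoU per bin. It assumes range is the horizontal distance from the sensor origin and that IoU is averaged over the classes present in each bin. The paper's exact protocol is not given in this record, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def range_bin(x, y, bin_width_m=10.0):
    """Bin index by horizontal distance from the sensor, e.g. 3 means 30-40 m."""
    return int(math.hypot(x, y) // bin_width_m)


def miou_per_range_bin(points, labels, preds, num_classes, bin_width_m=10.0):
    """Mean IoU per distance bin, averaged over classes that occur in the bin."""
    # stats[bin][class] = [intersection, union]
    stats = defaultdict(lambda: [[0, 0] for _ in range(num_classes)])
    for (x, y, _z), gt, pr in zip(points, labels, preds):
        per_class = stats[range_bin(x, y, bin_width_m)]
        if gt == pr:
            per_class[gt][0] += 1  # true positive counts in intersection
            per_class[gt][1] += 1  # and in union
        else:
            per_class[gt][1] += 1  # false negative for the true class
            per_class[pr][1] += 1  # false positive for the predicted class
    result = {}
    for b, per_class in stats.items():
        ious = [i / u for i, u in per_class if u > 0]
        result[b] = sum(ious) / len(ious)
    return result


# Toy data: three points in the 30-40 m bin, two in the 40-50 m bin.
points = [(35.0, 0.0, 0.0), (0.0, 32.0, 0.0), (31.0, 0.0, 0.0),
          (45.0, 0.0, 0.0), (0.0, 48.0, 0.0)]
labels = [0, 1, 1, 0, 1]
preds = [0, 1, 0, 0, 0]
scores = miou_per_range_bin(points, labels, preds, num_classes=2)
```

Comparing these per-bin scores between a baseline and a modified model gives the kind of range-specific difference the description refers to. The toy labels above are illustrative and not drawn from SemanticKITTI.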
format	Article
id	doaj-art-daa17b15dfdf449da3fa1252be7efdea
institution	Kabale University
issn	1753-8947 1753-8955
language	English
publishDate	2025-08-01
publisher	Taylor & Francis Group
record_format	Article
series	International Journal of Digital Earth
spelling	doaj-art-daa17b15dfdf449da3fa1252be7efdea; 2025-08-25T11:24:58Z; eng; Taylor & Francis Group; International Journal of Digital Earth; ISSN 1753-8947, 1753-8955; 2025-08-01; vol. 18, no. 1; DOI 10.1080/17538947.2025.2528811
	Yang Luo: School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
	Ting Han: School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, People’s Republic of China
	Xiaorong Zhang: School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
	Yujun Liu: School of Architecture and Urban Planning, Shenzhen University, Shenzhen, People’s Republic of China
	Duxin Zhu: School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
	Jinyuan Li: School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
	Yiping Chen: School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, People’s Republic of China
	Yundong Wu: School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
	Guorong Cai: School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
	Yingchao Piao: Computer Network Information Center, Chinese Academy of Sciences, Beijing, People’s Republic of China
	Jinhe Su: School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China
title	PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation
topic	Point cloud; semantic segmentation; urban scene; multimodal; positional guided; semantic alignment
url	https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811

Similar Items