PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation
The application of LiDAR point clouds to urban environment analysis has become a critical approach in urban scene understanding. Concurrently, substantial progress has been made in 3D point cloud semantic segmentation, advancing the precision and effectiveness of urban scene interpretation. However,...
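| Main Authors: | Yang Luo, Ting Han, Xiaorong Zhang, Yujun Liu, Duxin Zhu, Jinyuan Li, Yiping Chen, Yundong Wu, Guorong Cai, Yingchao Piao, Jinhe Su |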
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2025-08-01 |
| Series: | International Journal of Digital Earth |
| Subjects: | Point cloud; semantic segmentation; urban scene; multimodal; positional guided; semantic alignment |
| Online Access: | https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811 |
| _version_ | 1849224398639202304 |
|---|---|
| author | Yang Luo, Ting Han, Xiaorong Zhang, Yujun Liu, Duxin Zhu, Jinyuan Li, Yiping Chen, Yundong Wu, Guorong Cai, Yingchao Piao, Jinhe Su |
| collection | DOAJ |
| description | The application of LiDAR point clouds to urban environment analysis has become a critical approach in urban scene understanding. Concurrently, substantial progress has been made in 3D point cloud semantic segmentation, advancing the precision and effectiveness of urban scene interpretation. However, existing methods face challenges when handling long-range LiDAR point clouds, where reduced point density and increased noise at greater distances result in segmentation errors and diminished accuracy. To this end, we propose PASeg, which incorporates two key components: the Positional-Guided Classifier (PGC) and the Multimodal Semantic Alignment (MSA) module. The PGC uses positional embeddings to dynamically adjust normalization parameters, thereby improving segmentation accuracy across varying distances. The MSA module aligns semantic features from text, image, and point cloud data, facilitating better category differentiation. Together, PGC and MSA reinforce each other to strengthen large-scale 3D semantic segmentation. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that PASeg’s overall segmentation performance is competitive with state-of-the-art methods. Notably, our method achieves significant improvements of over 2.3% and 1.7% in long-range LiDAR point cloud segmentation (30–40 m and 40–50 m, respectively) compared to the baseline segmenter on the SemanticKITTI dataset. PASeg improves urban segmentation for smart, sustainable city development. |
| format | Article |
| id | doaj-art-daa17b15dfdf449da3fa1252be7efdea |
| institution | Kabale University |
| issn | 1753-8947, 1753-8955 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Taylor & Francis Group |
| record_format | Article |
| series | International Journal of Digital Earth |
| spelling | doaj-art-daa17b15dfdf449da3fa1252be7efdea; 2025-08-25T11:24:58Z; eng; Taylor & Francis Group; International Journal of Digital Earth; 1753-8947, 1753-8955; 2025-08-01; 18; 1; 10.1080/17538947.2025.2528811; PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation; Yang Luo (School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China); Ting Han (School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, People’s Republic of China); Xiaorong Zhang (School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China); Yujun Liu (School of Architecture and Urban Planning, Shenzhen University, Shenzhen, People’s Republic of China); Duxin Zhu (School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China); Jinyuan Li (School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China); Yiping Chen (School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, People’s Republic of China); Yundong Wu (School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China); Guorong Cai (School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China); Yingchao Piao (Computer Network Information Center, Chinese Academy of Sciences, Beijing, People’s Republic of China); Jinhe Su (School of Computer Engineering, Jimei University, Xiamen, People’s Republic of China); https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811; Point cloud; semantic segmentation; urban scene; multimodal; positional guided; semantic alignment |
| title | PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation |
| topic | Point cloud; semantic segmentation; urban scene; multimodal; positional guided; semantic alignment |
| url | https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811 |