Intelligent segmentation of Chinese address elements combining textual and spatial semantic features

Bibliographic Details
Main Authors: Xuefeng Yan, An Luo, Jiping Liu, Yong Wang, Ya Zhang
Format: Article
Language: English
Published: Taylor & Francis Group 2025-08-01
Series: International Journal of Digital Earth
Online Access: https://www.tandfonline.com/doi/10.1080/17538947.2025.2481142
Description
Summary: Chinese address segmentation is a critical step in tasks such as address parsing, address standardization, address matching, and geocoding. However, existing research has largely overlooked the spatial characteristics of address elements, which limits further performance gains. This paper integrates textual and spatial semantic features and proposes a framework for intelligent segmentation of Chinese addresses via multi-semantic features (FISCA-MS). The framework uses the pooled output of the pretrained language model MacBERT to represent the textual semantic features of address elements and introduces a novel full-scale clustering evaluation method (FSCE) to compute their spatial semantic features. The two feature sets are combined and passed to a bidirectional gated recurrent unit (BiGRU) neural network, which segments Chinese address elements automatically. The experimental results show that FISCA-MS outperforms both the variant that uses only textual semantic features (FISCA-TS) and the variant that uses only spatial semantic features (FISCA-SS), with accuracy improvements of 0.725% and 38.05%, respectively.
ISSN: 1753-8947, 1753-8955
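
Illustrative sketch (not from the article): the summary describes combining a textual feature vector and a spatial feature vector for each unit of an address before a sequence model decides where the elements begin and end. The plain-Python sketch below shows only the two bookkeeping steps around that model, under stated assumptions: features are fused by simple concatenation, the units are single characters, and the model's output is a per-character "starts a new element" flag. The function names `fuse_features` and `split_by_boundaries` are hypothetical, and MacBERT, FSCE, and the BiGRU themselves are not implemented here.

```python
from typing import List, Sequence


def fuse_features(
    textual: Sequence[Sequence[float]],
    spatial: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate the textual and spatial vectors of each unit.

    Assumption: fusion is plain concatenation; the article may combine
    the two feature sets differently.
    """
    if len(textual) != len(spatial):
        raise ValueError("textual and spatial sequences must have equal length")
    return [list(t) + list(s) for t, s in zip(textual, spatial)]


def split_by_boundaries(address: str, is_start: Sequence[bool]) -> List[str]:
    """Cut an address string wherever a character is flagged as an element start.

    Assumption: the sequence model emits one boolean per character; this
    labelling scheme is hypothetical and chosen only for illustration.
    """
    if len(address) != len(is_start):
        raise ValueError("need exactly one flag per character")
    elements: List[str] = []
    for ch, start in zip(address, is_start):
        if start or not elements:
            elements.append(ch)   # open a new address element
        else:
            elements[-1] += ch    # extend the current element
    return elements


if __name__ == "__main__":
    # Hand-written flags standing in for model predictions.
    flags = [True, False, False, True, False, False]
    print(split_by_boundaries("北京市海淀区", flags))
```

In a full pipeline the fused vectors would be the input sequence to the recurrent network, and the flags passed to `split_by_boundaries` would come from its predictions rather than being written by hand.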