Robust multi-source geographic entities matching by maximizing geometric and semantic similarity
Abstract Geographic entity matching is an important means for multi-source spatial data fusion and information association and sharing. Corresponding matching methods have been designed by existing studies for different types of entity data characteristics, such as line and area. However, these appr...
Main Authors: | YuHan Yan, PengDa Wu, Yong Yin, PeiPei Guo |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-12-01 |
Series: | Scientific Reports |
Subjects: | Area entity; Line entity; Feature segment; Pattern recognition; Semantic similarity |
Online Access: | https://doi.org/10.1038/s41598-024-79812-2 |
_version_ | 1841559459339436032 |
---|---|
author | YuHan Yan, PengDa Wu, Yong Yin, PeiPei Guo |
author_facet | YuHan Yan, PengDa Wu, Yong Yin, PeiPei Guo |
author_sort | YuHan Yan |
collection | DOAJ |
description | Geographic entity matching is an important means of multi-source spatial data fusion and of information association and sharing. Existing studies have designed matching methods for specific entity types, such as line and area entities. However, these approaches often generalize poorly when matching heterogeneous data from multiple sources and lose accuracy on complex matching patterns. To resolve these problems, a robust multi-source geographic entity matching method that maximizes geometric and semantic similarity is proposed. First, each entity is segmented based on shape features, and the resulting feature segments are extracted as matching primitives. Second, feature segments are grouped into patterns comprising three major categories and fourteen subcategories. Third, pattern matching is performed using spatial similarity metrics such as maximum projection distance. Finally, the spatial matches are checked and refined through semantic similarity calculation. The proposed method is tested on two datasets from regions in southeast and northwest China. The experimental results demonstrate that the method can be applied effectively to both area and line entity matching, with strong generalization and application capability and significantly improved matching accuracy. Specifically, nine feature segment matching patterns are used for area entities and six for line entities, and precision and recall are both nearly 90%. |
format | Article |
id | doaj-art-ae68a4c552d243c1aa3594ad610ae6f2 |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2024-12-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-ae68a4c552d243c1aa3594ad610ae6f2; 2025-01-05T12:28:14Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-12-01; 14; 1; 1; 17; 10.1038/s41598-024-79812-2; Robust multi-source geographic entities matching by maximizing geometric and semantic similarity; YuHan Yan, PengDa Wu, Yong Yin, PeiPei Guo; Department of Geographic Information System, Chinese Academy of Surveying and Mapping (listed for all four authors); Abstract: Geographic entity matching is an important means for multi-source spatial data fusion and information association and sharing. Corresponding matching methods have been designed by existing studies for different types of entity data characteristics, such as line and area. However, these approaches are often limited in the generalization ability for matching heterogeneous data from multiple sources and the accuracy for complex pattern matching. To resolve these problems, robust multi-source geographic entities matching by maximizing geometric and semantic similarity is proposed. First, the entire entity is segmented based on shape features, and the partitioned feature segments are extracted as matching primitives; Second, feature segments are grouped into patterns, encompassing three major categories and fourteen subcategories; Following this, pattern matching is performed based on spatial similarity metric such as maximum projection distance, etc.; Finally, the spatial matches are detected and refined through semantic similarity calculation. The proposed method is tested using two datasets from regions in southeast and northwest China. The experimental results demonstrate that our method can be effectively applied to both area and line entity matching with strong generalization and application capability and significantly improved matching accuracy. Specifically, nine feature segment matching patterns for matching area entities and six for line entities are utilized, and the precision and recall are nearly 90%.; https://doi.org/10.1038/s41598-024-79812-2; Area entity; Line entity; Feature segment; Pattern recognition; Semantic similarity |
spellingShingle | YuHan Yan; PengDa Wu; Yong Yin; PeiPei Guo; Robust multi-source geographic entities matching by maximizing geometric and semantic similarity; Scientific Reports; Area entity; Line entity; Feature segment; Pattern recognition; Semantic similarity |
title | Robust multi-source geographic entities matching by maximizing geometric and semantic similarity |
title_full | Robust multi-source geographic entities matching by maximizing geometric and semantic similarity |
title_fullStr | Robust multi-source geographic entities matching by maximizing geometric and semantic similarity |
title_full_unstemmed | Robust multi-source geographic entities matching by maximizing geometric and semantic similarity |
title_short | Robust multi-source geographic entities matching by maximizing geometric and semantic similarity |
title_sort | robust multi source geographic entities matching by maximizing geometric and semantic similarity |
topic | Area entity; Line entity; Feature segment; Pattern recognition; Semantic similarity |
url | https://doi.org/10.1038/s41598-024-79812-2 |
work_keys_str_mv | AT yuhanyan robustmultisourcegeographicentitiesmatchingbymaximizinggeometricandsemanticsimilarity AT pengdawu robustmultisourcegeographicentitiesmatchingbymaximizinggeometricandsemanticsimilarity AT yongyin robustmultisourcegeographicentitiesmatchingbymaximizinggeometricandsemanticsimilarity AT peipeiguo robustmultisourcegeographicentitiesmatchingbymaximizinggeometricandsemanticsimilarity |
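
The abstract in this record describes a two-stage idea: candidate matches are first found by geometric similarity, then refined by semantic similarity. The Python sketch below illustrates only that general idea and is not the authors' implementation. Several things are assumptions made for illustration: the record does not define "maximum projection distance", so it is taken here as the largest distance from any vertex of one polyline to the other polyline; semantic similarity is approximated by name string similarity from the standard library `difflib`; the thresholds `max_dist` and `min_name_sim` and all sample data are hypothetical; and the segmentation into feature segments and the grouping into patterns are omitted entirely.

```python
from difflib import SequenceMatcher
from math import hypot


def point_to_segment(p, a, b):
    """Distance from point p to segment a-b (projection clamped to the segment)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # degenerate segment: a and b coincide
        return hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def max_projection_distance(line_a, line_b):
    """Largest vertex-to-polyline distance from line_a to line_b.

    Assumed definition for illustration; the paper's metric may differ.
    """
    return max(
        min(point_to_segment(p, a, b) for a, b in zip(line_b, line_b[1:]))
        for p in line_a
    )


def name_similarity(a, b):
    """Stand-in for semantic similarity: case-insensitive string ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_entities(source, target, max_dist=5.0, min_name_sim=0.6):
    """Keep pairs that pass the geometric test and then the semantic test."""
    matches = []
    for s_id, (s_geom, s_name) in source.items():
        for t_id, (t_geom, t_name) in target.items():
            # Symmetric geometric distance: take the worse of both directions.
            d = max(max_projection_distance(s_geom, t_geom),
                    max_projection_distance(t_geom, s_geom))
            if d <= max_dist and name_similarity(s_name, t_name) >= min_name_sim:
                matches.append((s_id, t_id))
    return matches


# Hypothetical sample data: id -> (polyline vertices, name)
source = {
    "a1": ([(0, 0), (10, 0)], "Main Street"),
    "a2": ([(0, 50), (10, 50)], "River Road"),
}
target = {
    "b1": ([(0, 1), (10, 1)], "Main St"),
    "b2": ([(0, 2), (10, 2)], "Canal Path"),
    "b3": ([(0, 51), (10, 51)], "River Rd"),
}

matches = match_entities(source, target)
print(matches)
```

In this sample, `b2` lies geometrically close to `a1` but is rejected by the name check, which is the role the abstract assigns to the semantic refinement step.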