Beyond extraction accuracy: addressing the quality of geographical named entity through advanced recognition and correction models using a modified BERT framework
In the realm of geospatial services and applications, the accuracy of address information is of utmost importance. Traditional methods of data collection, being both labor-intensive and costly, have prompted researchers to turn to Volunteered Geographic Information (VGI) for the extraction of Geogra...
| Main Authors: | Liuchang Xu, Jiajun Zhang, Chengkun Zhang, Xinyu Zheng, Zhenhong Du, Xingyu Xue |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2025-05-01 |
| Series: | Geo-spatial Information Science |
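| Subjects: | Volunteered Geographic Information (VGI); BERT; geographical named entity; geographical named entity recognition; incremental pre-training; geographical named entity error correction |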
| Online Access: | https://www.tandfonline.com/doi/10.1080/10095020.2024.2354229 |
| _version_ | 1849397669640798208 |
|---|---|
| author | Liuchang Xu; Jiajun Zhang; Chengkun Zhang; Xinyu Zheng; Zhenhong Du; Xingyu Xue |
| author_sort | Liuchang Xu |
| collection | DOAJ |
| description | In the realm of geospatial services and applications, the accuracy of address information is of utmost importance. Traditional methods of data collection, being both labor-intensive and costly, have prompted researchers to turn to Volunteered Geographic Information (VGI) for the extraction of Geographical Named Entities (GNEs). However, prior studies have predominantly concentrated on enhancing extraction accuracy, while often overlooking the critical aspect of GNE quality. This study addresses this gap with a multifaceted approach. First, a Geographical Named Entity Semantic Model (GNESM) was constructed by improving the BERT framework, and ablation experiments on multiple influencing factors were conducted to verify its feasibility. Based on GNESM, a Geographical Named Entity Recognition Model (GNERM) was constructed by incremental pre-training with social media text data and fine-tuning, achieving a recognition accuracy of 90.9%. Subsequently, a Geographical Named Entity Error Correction Model (GNEECM) was constructed by training GNESM with standard GNE data and incorporating error detection and correction modules, achieving an accuracy of 96.6% in error detection and correction tasks. The experimental results demonstrate that the proposed identification and correction methods outperform all compared methods. Through the identification and correction process, this study obtained high-quality GNE data, providing a reference for expanding standard address libraries and for subsequent research on geographical named entities. |
| format | Article |
| id | doaj-art-3071b78a29644f78a4b95d94cd979ba2 |
| institution | Kabale University |
| issn | 1009-5020; 1993-5153 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Taylor & Francis Group |
| record_format | Article |
| series | Geo-spatial Information Science |
| spelling | doaj-art-3071b78a29644f78a4b95d94cd979ba2; 2025-08-20T03:38:55Z; eng; Taylor & Francis Group; Geo-spatial Information Science; ISSN 1009-5020, 1993-5153; 2025-05-01; vol. 28, no. 3, pp. 1195-1213; doi:10.1080/10095020.2024.2354229. Author affiliations: Liuchang Xu, Jiajun Zhang, Xinyu Zheng, Xingyu Xue (School of Mathematics and Computer Science, Zhejiang Agriculture and Forestry University, Hangzhou, China); Chengkun Zhang, Zhenhong Du (School of Earth Sciences, Zhejiang University, Hangzhou, China). |
| title | Beyond extraction accuracy: addressing the quality of geographical named entity through advanced recognition and correction models using a modified BERT framework |
| topic | Volunteered Geographic Information (VGI); BERT; geographical named entity; geographical named entity recognition; incremental pre-training; geographical named entity error correction |
| url | https://www.tandfonline.com/doi/10.1080/10095020.2024.2354229 |