Text Geolocation Prediction via Self-Supervised Learning

Bibliographic Details
Main Authors: Yuxing Wu, Zhuang Zeng, Kaiyue Liu, Zhouzheng Xu, Yaqin Ye, Shunping Zhou, Huangbao Yao, Shengwen Li
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: ISPRS International Journal of Geo-Information
Online Access: https://www.mdpi.com/2220-9964/14/4/170
Description
Summary: Text geolocation prediction aims to infer the geographic location of a text from its semantics, and it serves as a fundamental task for various geographic applications. Mainstream deep learning-based methods follow the supervised learning paradigm, which relies heavily on a large number of labeled samples to train model parameters. To address this limitation, this paper presents GeoSG (Geographic Self-Supervised Geolocation), a model that uses self-supervised learning to predict text geolocation when labeled samples are unavailable. Specifically, GeoSG integrates spatial distance and hierarchical constraints to characterize the interactions between points of interest (POIs) and texts in a geographic relationship graph. It then designs two self-supervised tasks that train a shared network to learn the relationships among POIs and texts. Finally, text geolocations are inferred from the trained shared network. Experimental results on two datasets show that the proposed method outperforms state-of-the-art baselines and is robust. This study provides a methodological reference for geolocating various text documents and offers a solution for the many geographic intelligence tasks that lack labeled samples.
ISSN: 2220-9964