GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models
As spatiotemporal data grows in complexity, utilizing geospatial modeling on the Google Earth Engine (GEE) platform poses challenges in improving coding efficiency for experts and enhancing the coding capabilities of interdisciplinary users. To address these challenges, we propose a framework for co...
| Main Authors: | Shuyang Hou, Jianyuan Liang, Anqi Zhao, Huayi Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2025-05-01 |
| Series: | Geo-spatial Information Science |
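| Subjects: | Geospatial operator knowledge base; Google Earth Engine; large language models; geospatial code generation; abstract syntax tree |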
| Online Access: | https://www.tandfonline.com/doi/10.1080/10095020.2025.2505556 |
| _version_ | 1849717280707969024 |
|---|---|
| author | Shuyang Hou; Jianyuan Liang; Anqi Zhao; Huayi Wu |
| author_facet | Shuyang Hou; Jianyuan Liang; Anqi Zhao; Huayi Wu |
| author_sort | Shuyang Hou |
| collection | DOAJ |
| description | As spatiotemporal data grows in complexity, utilizing geospatial modeling on the Google Earth Engine (GEE) platform poses challenges in improving coding efficiency for experts and enhancing the coding capabilities of interdisciplinary users. To address these challenges, we propose a framework for constructing a geospatial operator knowledge base tailored to the GEE JavaScript API. The framework includes an operator syntax knowledge table, an operator relationship frequency knowledge table, an operator frequent pattern knowledge table, and an operator relationship chain knowledge table. Leveraging Abstract Syntax Tree (AST) techniques and frequent itemset mining, we extract operator knowledge from 295,943 real GEE scripts and syntax documentation, forming a structured knowledge base. Experimental results demonstrate that the proposed framework achieves an accuracy ranging from 87% to 93% in operator relationship extraction tasks, measured by accuracy, recall, and F1 score metrics. In operator relationship chain extraction tasks, the framework achieves a performance range of 0.79 to 0.89 across LCS, Ngram, Siamese, and BERT-based evaluations. In geospatial code generation tasks, GEE-OPs improves the executability of mainstream Large Language Models (LLMs) by 38.0% to 44.9%, enhances correctness by 24.1% to 47.2%, and boosts readability by 4.7% to 7.6%. Ablation experiments further validate the essential role of each knowledge table in enhancing model performance. Additionally, key performance indicators – including response time, lines of code, token consumption, and memory usage – are documented to assist readers in replicating the experiments and gaining deeper insights into system performance. This work advances geospatial code modeling techniques and facilitates the application of LLMs in geoinformatics, contributing to the integration of generative AI into the field. |
| format | Article |
| id | doaj-art-8478afa08fe44d70bd9e62bdb253998d |
| institution | DOAJ |
| issn | 1009-5020 1993-5153 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Taylor & Francis Group |
| record_format | Article |
| series | Geo-spatial Information Science |
| spelling | doaj-art-8478afa08fe44d70bd9e62bdb253998d; 2025-08-20T03:12:42Z; eng; Taylor & Francis Group; Geo-spatial Information Science; 1009-5020; 1993-5153; 2025-05-01; 122; 10.1080/10095020.2025.2505556; GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models; Shuyang Hou (0); Jianyuan Liang (1); Anqi Zhao (2); Huayi Wu (3); (0) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China; (1) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China; (2) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China; (3) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China; As spatiotemporal data grows in complexity, utilizing geospatial modeling on the Google Earth Engine (GEE) platform poses challenges in improving coding efficiency for experts and enhancing the coding capabilities of interdisciplinary users. To address these challenges, we propose a framework for constructing a geospatial operator knowledge base tailored to the GEE JavaScript API. The framework includes an operator syntax knowledge table, an operator relationship frequency knowledge table, an operator frequent pattern knowledge table, and an operator relationship chain knowledge table. Leveraging Abstract Syntax Tree (AST) techniques and frequent itemset mining, we extract operator knowledge from 295,943 real GEE scripts and syntax documentation, forming a structured knowledge base. Experimental results demonstrate that the proposed framework achieves an accuracy ranging from 87% to 93% in operator relationship extraction tasks, measured by accuracy, recall, and F1 score metrics. In operator relationship chain extraction tasks, the framework achieves a performance range of 0.79 to 0.89 across LCS, Ngram, Siamese, and BERT-based evaluations. In geospatial code generation tasks, GEE-OPs improves the executability of mainstream Large Language Models (LLMs) by 38.0% to 44.9%, enhances correctness by 24.1% to 47.2%, and boosts readability by 4.7% to 7.6%. Ablation experiments further validate the essential role of each knowledge table in enhancing model performance. Additionally, key performance indicators – including response time, lines of code, token consumption, and memory usage – are documented to assist readers in replicating the experiments and gaining deeper insights into system performance. This work advances geospatial code modeling techniques and facilitates the application of LLMs in geoinformatics, contributing to the integration of generative AI into the field.; https://www.tandfonline.com/doi/10.1080/10095020.2025.2505556; Geospatial operator knowledge base; Google Earth Engine; large language models; geospatial code generation; abstract syntax tree |
| spellingShingle | Shuyang Hou; Jianyuan Liang; Anqi Zhao; Huayi Wu; GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models; Geo-spatial Information Science; Geospatial operator knowledge base; Google Earth Engine; large language models; geospatial code generation; abstract syntax tree |
| title | GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models |
| title_full | GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models |
| title_fullStr | GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models |
| title_full_unstemmed | GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models |
| title_short | GEE-OPs: an operator knowledge base for geospatial code generation on the Google Earth Engine platform powered by large language models |
| title_sort | gee ops an operator knowledge base for geospatial code generation on the google earth engine platform powered by large language models |
| topic | Geospatial operator knowledge base; Google Earth Engine; large language models; geospatial code generation; abstract syntax tree |
| url | https://www.tandfonline.com/doi/10.1080/10095020.2025.2505556 |
| work_keys_str_mv | AT shuyanghou geeopsanoperatorknowledgebaseforgeospatialcodegenerationonthegoogleearthengineplatformpoweredbylargelanguagemodels AT jianyuanliang geeopsanoperatorknowledgebaseforgeospatialcodegenerationonthegoogleearthengineplatformpoweredbylargelanguagemodels AT anqizhao geeopsanoperatorknowledgebaseforgeospatialcodegenerationonthegoogleearthengineplatformpoweredbylargelanguagemodels AT huayiwu geeopsanoperatorknowledgebaseforgeospatialcodegenerationonthegoogleearthengineplatformpoweredbylargelanguagemodels |
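
The description above mentions extracting operator knowledge from GEE scripts through AST techniques and frequent itemset mining. The following is only a rough sketch of that idea, not the authors' implementation: it swaps in a regular expression for real AST parsing, restricts itemsets to operator pairs, and runs on a hypothetical three-script corpus. The names `SCRIPTS`, `operators`, `transition_counts`, and `frequent_pairs` are illustrative assumptions and do not come from the article.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus (assumption); the article mines 295,943 real GEE scripts.
SCRIPTS = [
    "var a = ee.ImageCollection('X').filterDate('2020-01-01', '2020-12-31').median();",
    "var b = ee.ImageCollection('Y').filterDate('2021-01-01', '2021-06-30').filterBounds(roi).median();",
    "var c = ee.Image('Z').normalizedDifference(['B5', 'B4']);",
]

# Regex stand-in for AST traversal (assumption): matches `ee.Name(` or `.method(`.
CALL = re.compile(r"\b(ee\.[A-Za-z_]\w*)\s*\(|\.([A-Za-z_]\w*)\s*\(")


def operators(script):
    """Return the operator names called in a script, in source order."""
    return [m.group(1) or m.group(2) for m in CALL.finditer(script)]


def transition_counts(scripts):
    """Count how often one operator is immediately followed by another."""
    counts = Counter()
    for script in scripts:
        ops = operators(script)
        counts.update(zip(ops, ops[1:]))
    return counts


def frequent_pairs(scripts, min_support=2):
    """Return operator pairs that co-occur in at least `min_support` scripts."""
    counts = Counter()
    for script in scripts:
        unique_ops = sorted(set(operators(script)))
        counts.update(combinations(unique_ops, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    for pair, n in sorted(frequent_pairs(SCRIPTS).items()):
        print(pair, n)
```

In this sketch, `transition_counts` loosely corresponds to a relationship frequency table and `frequent_pairs` to a frequent pattern table. A production pipeline would more plausibly parse each script with a JavaScript parser and use a full itemset miner such as Apriori or FP-Growth.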