CUGUV: A Benchmark Dataset for Promoting Large-Scale Urban Village Mapping with Deep Learning Models
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-03-01 |
| Series: | Scientific Data |
| Online Access: | https://doi.org/10.1038/s41597-025-04701-w |
| Summary: | Delineating the extent of urban villages (UVs) is crucial for effective urban planning and management, as well as for providing targeted policy and financial support. Unlike field surveys, the interpretation of satellite imagery provides an efficient, near real-time, and objective means of mapping UVs. However, current research efforts predominantly concentrate on individual cities, resulting in a scarcity of interpretable UV maps for numerous other cities. This gap in availability not only hinders public awareness of the distribution and evolution of UVs but also limits the reliability and transferability of models, because the available samples are insufficient in number and diversity. To address this issue, we developed CUGUV, a benchmark dataset that includes a diverse collection of thousands of UV samples, carefully curated from 15 major cities across various geographical regions in China. The dataset is available at https://doi.org/10.6084/m9.figshare.26198093 and can serve as a foundation for evaluating and improving the robustness and transferability of models. We also present a framework that integrates and learns from multiple data sources to better address the cross-city UV mapping task. Tests show that the proposed models exceed 92% in overall accuracy, precision, and F1-score, outperforming state-of-the-art models, which highlights the effectiveness of both the proposed dataset and the model. Together, the dataset and model improve our ability to understand and accurately model these complex and diverse phenomena, leading to a notable improvement in the performance of large-scale UV mapping. |
|---|---|
| ISSN: | 2052-4463 |