A data-driven approach to urban area delineation using multi-source geospatial data

Bibliographic Details
Main Authors: Chenyu Fang, Lin Zhou, Xinyue Gu, Xing Liu, Martin Werner
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-93366-x
Description
Summary: This study introduces a data-driven, bottom-up approach to urban delineation, integrating feature engineering with the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, which represents a significant improvement in precision and methodology compared to traditional approaches that rely on simplistic OpenStreetMap (OSM) road node data aggregations. By employing a broad array of OSM categories and refining data selection through feature engineering, our research significantly enhances the precision and relevance of urban clustering. Using Bavaria, Germany, as a case study, we demonstrate that feature engineering effectively reduces noise and mitigates common DBSCAN clustering pitfalls by filtering out irrelevant and autocorrelated data. The robustness of the proposed method is validated through a comprehensive assessment involving three key elements: (1) a 5% improvement in average accuracy, (2) optimal clustering selections based on entropy values that eliminate the need for prior knowledge, and (3) validation through nighttime light data and Zipf’s law, where a high p-value of 0.99 confirms a good fit, supporting the power law. This study contributes to urban studies by providing a scalable, replicable model that incorporates advanced data processing techniques and multidimensional data sources, supporting improved urban planning and policy-making while effectively delineating urban areas in varied settings.
ISSN: 2045-2322
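
The summary names DBSCAN as the clustering step. As a rough illustration only, the sketch below is a minimal pure-Python DBSCAN run on hypothetical 2D coordinates. The sample points and the `eps` and `min_pts` values are assumptions chosen for the example, and none of the study's OSM feature engineering, entropy-based parameter selection, or validation is reproduced here.

```python
from math import hypot

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not a core point; may become a border point later
            continue
        cluster += 1  # points[i] is a core point, so it seeds a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
                continue
            if labels[j] is not None:
                continue  # already claimed by a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


# Hypothetical coordinates: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

With these assumed parameters the two dense groups come out as separate clusters and the isolated point is labelled as noise. For real geospatial data, distances would need to be computed in a projected or geodesic coordinate system rather than with plain Euclidean distance on raw coordinates.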