SceneDiffusion: Scene Generation Model Embedded with Spatial Constraints
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | ISPRS International Journal of Geo-Information |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2220-9964/14/7/250 |
| Summary: | Spatial scenes, as fundamental units of geospatial cognition, encompass rich objects and spatial relationships, and their generation techniques hold significant application value in disaster simulation and emergency drills, delayed spatial reconstruction and analysis, and other fields. However, existing studies still face limitations in modeling complex spatial relationships during scene generation, leading to insufficient semantic consistency and geographical accuracy. The advancement of Geospatial Artificial Intelligence (GeoAI) offers a new technical pathway for the intelligent modeling of spatial scenes. Against this backdrop, we propose SceneDiffusion, a scene generation model embedded with spatial constraints, and construct a geospatial scene dataset incorporating spatial relationship descriptions and geographic semantics, aiming to enhance the understanding and modeling capabilities of GeoAI models for spatial information. Specifically, SceneDiffusion employs a spatial scene representation framework to uniformly characterize objects and their topological, directional, and distance relationships, enhances the interactive modeling of objects and relationships through a Spatial relationship Attention-aware Graph (SAG) module, and finally generates high-quality scene images conforming to geographic semantics using a Layout information-guided Conditional Diffusion (LCD) module. Both qualitative and quantitative experiments demonstrate the superiority of SceneDiffusion, achieving a 56.6% reduction in FID and a 35.3% improvement in SSIM compared to baseline methods. Ablation studies confirm the importance of multi-relational modeling with attention mechanisms. By generating scenes that satisfy spatial distribution constraints, this work provides technical support for applications such as emergency scene simulation and virtual scene construction, while also offering insights for theoretical research and methodological innovation in GeoAI. |
| ISSN: | 2220-9964 |
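
The summary describes a scene representation that uniformly characterizes objects together with their topological, directional, and distance relationships. The record does not say how that representation is built, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes objects are reduced to axis-aligned bounding boxes with y growing northward; `SceneObject`, `scene_triples`, the relation labels, and the `near_threshold` value are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical scene object: a label plus an axis-aligned bounding box."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def topological(a, b):
    """Coarse topological relation of a with respect to b.

    Boxes that merely touch along an edge are reported as "overlaps".
    """
    if a.x1 < b.x0 or b.x1 < a.x0 or a.y1 < b.y0 or b.y1 < a.y0:
        return "disjoint"
    if a.x0 <= b.x0 and a.y0 <= b.y0 and a.x1 >= b.x1 and a.y1 >= b.y1:
        return "contains"
    if b.x0 <= a.x0 and b.y0 <= a.y0 and b.x1 >= a.x1 and b.y1 >= a.y1:
        return "within"
    return "overlaps"


def directional(a, b):
    """Dominant direction of a relative to b, from the centroid offset."""
    (ax, ay), (bx, by) = a.center, b.center
    dx, dy = ax - bx, ay - by
    if abs(dx) >= abs(dy):
        return "east" if dx >= 0 else "west"
    return "north" if dy > 0 else "south"


def distance(a, b, near_threshold=50.0):
    """Qualitative distance; the threshold is an arbitrary assumption."""
    (ax, ay), (bx, by) = a.center, b.center
    return "near" if hypot(ax - bx, ay - by) <= near_threshold else "far"


def scene_triples(objects, near_threshold=50.0):
    """Return (subject, relation, object) triples for every unordered pair."""
    triples = []
    for i, a in enumerate(objects):
        for b in objects[i + 1:]:
            relations = (
                topological(a, b),
                directional(a, b),
                distance(a, b, near_threshold),
            )
            for relation in relations:
                triples.append((a.name, relation, b.name))
    return triples
```

Triples of this kind could serve as the edges of a scene graph that a graph-based module consumes. How SceneDiffusion actually encodes relationships for its SAG and LCD modules is described in the article itself, not in this record.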