Morphological-Priors-Guided Network With Semantic Booster and Scalable Bins Module for Height Estimation From Single-View Remote Sensing Images
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11048889/ |
| Summary: | Geographic height information describes the vertical spatial structure of a city and serves as important foundational data for urban management. Obtaining height information from single-view remote sensing images is a relatively low-cost and convenient approach. However, current methods for inferring height from monocular remote sensing images face several bottlenecks, such as difficulty in learning 3-D semantic information and in accurately fitting the height morphology of varied local scenes. In this study, we address these challenges by proposing a morphological-priors-guided network, termed MPG-Net, for accurate height estimation from single-view remote sensing images. First, considering the semantic morphological priors, we propose to explicitly enhance the 3-D visual cues (e.g., co-occurrence relationship between shadow buildings and shadow trees) and simultaneously design a semantic booster composed of a two-stream network with a multilevel cross-stream attention fusion mechanism to facilitate 3-D feature learning for monocular height estimation. Second, taking into account the height distribution priors, we propose a scalable bins module that can create fully adaptive bins within a flexible height range for each input image, leading to a more accurate delineation of the height distribution pattern. The proposed MPG-Net is comprehensively evaluated on two datasets of different scenes (i.e., the ISPRS Vaihingen and Potsdam datasets). Results indicate that the proposed MPG-Net significantly outperforms existing methods, achieving the lowest root-mean-square errors of 1.613 m and 1.947 m on Vaihingen and Potsdam, respectively. Furthermore, extensive ablation studies demonstrate the contribution of each designed component in the proposed method. |
| ISSN: | 1939-1404, 2151-1535 |
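
The summary describes adaptive height bins over a flexible per-image range and reports results as root-mean-square error. The sketch below illustrates one generic way such a bins-based estimate could work; it is not the authors' MPG-Net implementation. All function names, the softmax normalization of bin widths, and the expected-value readout are assumptions chosen for illustration, and the record itself does not specify any of them.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def bin_centers(width_logits, h_min, h_max):
    """Turn per-image width logits into bin centers inside [h_min, h_max].

    Assumption: widths are normalized with a softmax so they sum to 1,
    then scaled to the (per-image) height range.
    """
    widths = softmax(width_logits)
    centers, left = [], h_min
    for w in widths:
        right = left + w * (h_max - h_min)
        centers.append(0.5 * (left + right))
        left = right
    return centers


def expected_height(pixel_logits, centers):
    """Height of one pixel as the probability-weighted sum of bin centers."""
    probs = softmax(pixel_logits)
    return sum(p * c for p, c in zip(probs, centers))


def rmse(pred, target):
    """Root-mean-square error between two equal-length lists of heights (m)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


if __name__ == "__main__":
    # Hypothetical image: four bins over a 0-8 m height range.
    centers = bin_centers([0.0, 0.0, 0.0, 0.0], h_min=0.0, h_max=8.0)
    height = expected_height([0.0, 0.0, 0.0, 0.0], centers)
    print(centers, height, rmse([height], [4.0]))
```

Because both the bin widths and the range limits are inputs here, each image can receive its own bin layout, which is the property the summary attributes to the scalable bins module.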