Morphological-Priors-Guided Network With Semantic Booster and Scalable Bins Module for Height Estimation From Single-View Remote Sensing Images

Bibliographic Details
Main Authors: Tao Zhang, Furong Shi, Yuanping Zhu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/11048889/
Description
Summary: Geographic height information describes the vertical spatial structure of the city and serves as important foundational data for urban management. Obtaining height information from single-view remote sensing images is a relatively low-cost and convenient approach. However, current methods for inferring height information from monocular remote sensing images face several bottlenecks, such as difficulties in learning 3-D semantic information and in accurately fitting the height morphology in various local scenes. In this study, we address these challenges by proposing a morphological-priors-guided network, termed MPG-Net, for accurate height estimation from single-view remote sensing images. First, considering the semantic morphological priors, we propose to explicitly enhance the 3-D visual cues (e.g., co-occurrence relationship between shadow buildings and shadow trees) and simultaneously design a semantic booster composed of a two-stream network with a multilevel cross-stream attention fusion mechanism to facilitate the 3-D feature learning for monocular height estimation. Second, taking into account the height distribution priors, we propose a scalable bins module that can create fully adaptive bins within a flexible height range for each input image, leading to a more accurate delineation of the height distribution pattern. The proposed MPG-Net is comprehensively evaluated on two datasets of different scenes (i.e., the ISPRS Vaihingen and Potsdam datasets). Results indicate that the proposed MPG-Net significantly outperforms the existing methods, achieving the lowest root-mean-square errors of 1.613 m and 1.947 m on Vaihingen and Potsdam, respectively. Furthermore, extensive ablation studies demonstrate the contribution of each designed component in the proposed method.
ISSN: 1939-1404
2151-1535
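
Illustrative sketch: The summary mentions per-image adaptive bins within a flexible height range and reports root-mean-square error. The NumPy sketch below shows one common way such a bin-based height head can be assembled, and how RMSE is computed. It is not the authors' scalable bins module, whose details the record does not give. The function names, the softmax-normalized bin widths, the softplus-scaled range, and the probability-weighted sum over bin centers are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def heights_from_bins(width_logits, range_params, pixel_logits):
    """Hypothetical bin-based height head (an assumption, not MPG-Net itself).

    width_logits : (K,) raw per-image scores turned into relative bin widths
    range_params : (2,) raw per-image values giving (minimum height, raw span)
    pixel_logits : (K, H, W) per-pixel scores over the K bins
    Returns the (H, W) height map and the (K,) bin centers.
    """
    # Relative bin widths are positive and sum to 1 for this image.
    widths = softmax(width_logits, axis=0)
    # Softplus keeps the per-image height span strictly positive.
    h_min = range_params[0]
    span = np.log1p(np.exp(range_params[1]))
    # Bin edges partition [h_min, h_min + span]; centers are edge midpoints.
    edges = h_min + span * np.concatenate(([0.0], np.cumsum(widths)))
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Each pixel's height is the probability-weighted sum of bin centers.
    probs = softmax(pixel_logits, axis=0)
    height_map = np.tensordot(centers, probs, axes=(0, 0))
    return height_map, centers


def rmse(pred, target):
    """Root-mean-square error between two arrays of heights (same units)."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))


# Usage with random stand-in values in place of real network outputs.
rng = np.random.default_rng(0)
height_map, centers = heights_from_bins(
    rng.normal(size=8),            # 8 bins
    np.array([0.0, 3.0]),          # raw range parameters
    rng.normal(size=(8, 4, 4)),    # 4 x 4 image
)
```

Because the bin widths and the range are recomputed for every image in this sketch, a flat suburban tile and a dense high-rise tile would each get bins concentrated where their heights actually fall, which is the general motivation behind adaptive binning.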