Synergistic Semantic Segmentation and Height Estimation for Monocular Remote Sensing Images via Cross-Task Interaction

Bibliographic Details
Main Authors:	Xuanang Peng, Shixin Wang, Futao Wang, Jinfeng Zhu, Suju Li, Longfei Liu, Zhenqing Wang
Format:	Article
Language:	English
Published:	MDPI AG 2025-05-01
Series:	Remote Sensing
Subjects:	remote sensing imagery; semantic segmentation; height estimation; multi-task learning
Online Access:	https://www.mdpi.com/2072-4292/17/9/1637

Description
Summary:	Semantic segmentation and height estimation in remote sensing imagery are two pivotal, closely interrelated tasks for scene understanding. Although deep learning methods have made remarkable progress on both tasks in recent years, several challenges remain. Recent studies have shown that multi-task learning can enhance the complementarity of task-related features, thus maximizing prediction accuracy across tasks at a low computational cost. However, because of factors such as complex semantic categories and the inconsistent spatial scales of remotely sensed images, existing multi-task learning methods often fail to improve results on these two tasks. To address this issue, we propose the Cross-Task Mutual Enhancement Network (CTME-Net), a novel architecture designed to jointly perform height estimation and semantic segmentation on remote sensing imagery. First, to generate discriminative initial features for each task branch and activate dedicated pathways for cross-task feature disentanglement, we design a universal initial feature embedding module for each downstream task. Second, to address the impact of redundancy in general features during global–local fusion, we develop an Adaptive Task-specific Feature Distillation Module that enhances the model’s ability to acquire task-specific features. Finally, we propose a task feature interaction module in which the tasks refine each other's features, maximizing task-specific feature expression. We conduct extensive experiments on the ISPRS Vaihingen and Potsdam datasets to validate the effectiveness of our approach. The results demonstrate that the proposed method outperforms existing methods in both height estimation and semantic segmentation.
ISSN:	2072-4292