Spatial-Temporal Semantic Feature Interaction Network for Semantic Change Detection in Remote Sensing Images

Bibliographic Details
Main Authors: Yuhang Zhang, Wuxia Zhang, Songtao Ding, Siyuan Wu, Xiaoqiang Lu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/10979855/
Description
Summary: Semantic Change Detection (SCD) in Remote Sensing Images (RSI) aims to identify changes in the type of Land Cover/Land Use (LCLU). The “from-to” information it yields has greater practical significance than the output of Binary Change Detection (BCD). However, most deep learning-based SCD algorithms do not fully exploit the spatial-temporal information of multilevel features, which makes it difficult to extract LCLU features in complex scenes. To address this issue, we propose a Spatial-Temporal Semantic Feature Interaction Network (STS-FINet) to improve the performance of SCD in RSI. The proposed STS-FINet comprises a Multi-Scale Feature Extraction Encoder (MS-FEE), a Transformer-based Multilevel Feature Interaction module (TML-FI), and a Multilevel Feature Fusion Decoder (ML-FFD). The MS-FEE extracts deep semantic and differential information from the RSI. The TML-FI mines spatial-temporal information by extracting long-range dependencies and spatial information from multilevel features, which improves spatial perception. Moreover, a Mixed Spatial Reasoning Convolution block (MixSrc) is presented to enrich the spatial information by extracting multiscale features, thus improving the model's capability to interpret complex scenes. Finally, the ML-FFD integrates the multilevel features to generate the semantic change map. The effectiveness of the proposed STS-FINet is verified on two high-resolution RSI datasets. Experimental results show that the proposed STS-FINet achieves better change detection performance than state-of-the-art (SOTA) methods.
ISSN: 1939-1404, 2151-1535
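
Illustration: the summary contrasts the binary output of BCD with the “from-to” output of SCD. The following is a minimal sketch of that difference on a toy pair of label maps. It is not the STS-FINet method. The class ids, the class names in CLASSES, and the function names binary_change and from_to_change are all hypothetical and chosen only for illustration.

```python
# Hypothetical LCLU class ids and names (assumptions, not from the paper).
CLASSES = {0: "water", 1: "vegetation", 2: "building"}


def binary_change(before, after):
    """BCD-style map: 1 where the class label differs between dates, else 0."""
    return [
        [int(b != a) for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


def from_to_change(before, after):
    """SCD-style map: a (from, to) pair of class names for each changed pixel,
    and None for each unchanged pixel."""
    return [
        [(CLASSES[b], CLASSES[a]) if b != a else None
         for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


# Toy 2x2 label maps for the two acquisition dates.
before = [[0, 1],
          [1, 1]]
after = [[0, 2],
         [1, 0]]

bcd_map = binary_change(before, after)
scd_map = from_to_change(before, after)

print(bcd_map)
print(scd_map)
```

In this toy case the binary map only flags the two changed pixels, while the from-to map also records that one changed from vegetation to building and the other from vegetation to water.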