Sliding-Window Dissimilarity Cross-Attention for Near-Real-Time Building Change Detection

A near-real-time change detection network can consistently identify unauthorized construction activities over a wide area, empowering authorities to enforce regulations efficiently. Furthermore, it can promptly assess building damage, enabling expedited rescue efforts. The extensive adoption of deep...

Bibliographic Details
Main Authors: Wen Lu, Minh Nguyen
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Remote Sensing
Subjects: remote sensing; building change detection; transformer; cross-attention
Online Access: https://www.mdpi.com/2072-4292/17/1/135
collection DOAJ
description A near-real-time change detection network can consistently identify unauthorized construction activities over a wide area, empowering authorities to enforce regulations efficiently. Furthermore, it can promptly assess building damage, enabling expedited rescue efforts. The extensive adoption of deep learning in change detection has prompted a predominant emphasis on enhancing detection performance, primarily through the expansion of the depth and width of networks, overlooking considerations regarding inference time and computational cost. To accurately represent the spatio-temporal semantic correlations between pre-change and post-change images, we create an innovative transformer attention mechanism named Sliding-Window Dissimilarity Cross-Attention (SWDCA), which detects spatio-temporal semantic discrepancies by explicitly modeling the dissimilarity of bi-temporal tokens, departing from the mono-temporal similarity attention typically used in conventional transformers. In order to fulfill the near-real-time requirement, SWDCA employs a sliding-window scheme to limit the range of the cross-attention mechanism within a predetermined window/dilated window size. This approach not only excludes distant and irrelevant information but also reduces computational cost. Furthermore, we develop a lightweight Siamese backbone for extracting building and environmental features. Subsequently, we integrate an SWDCA module into this backbone, forming an efficient change detection network. Quantitative evaluations and visual analyses of thorough experiments verify that our method achieves top-tier accuracy on two building change detection datasets of remote sensing imagery, while also achieving a real-time inference speed of 33.2 FPS on a mobile GPU.
id doaj-art-2d44364bdf124a1083165f70d228fbce
institution Kabale University
issn 2072-4292
spelling doaj-art-2d44364bdf124a1083165f70d228fbce; 2025-01-10T13:20:21Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-01-01; vol. 17, no. 1, art. 135; DOI 10.3390/rs17010135; Wen Lu and Minh Nguyen (both: School of Engineering, Computer & Mathematical Sciences, Auckland University of Technology, Auckland 1010, New Zealand)
topic remote sensing
building change detection
transformer
cross-attention
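
The description in this record characterizes SWDCA only at a high level: cross-attention between pre-change and post-change tokens, weighted by dissimilarity rather than similarity, and restricted to a fixed (optionally dilated) sliding window. The record gives no formulas, so the pure-Python sketch below is an illustration under stated assumptions and not the authors' implementation. The assumptions are: tokens lie on a 1-D sequence rather than an image grid, dissimilarity is taken to be the negated scaled dot product, and the function name and the `radius` and `dilation` parameters are hypothetical.

```python
import math


def windowed_dissimilarity_cross_attention(pre, post, radius=1, dilation=1):
    """Illustrative sketch only (not the paper's formulation).

    pre, post: equal-length lists of feature vectors, one per token position,
    taken from the pre-change and post-change images respectively.
    Each post token i queries only the pre tokens at positions
    i + k * dilation for k in [-radius, radius] that fall inside the sequence.
    Scores are the NEGATED scaled dot product (an assumption), so tokens that
    differ more from the query receive larger softmax weights.
    Returns (outputs, attn), where attn[i] maps pre index -> weight.
    """
    n = len(pre)
    dim = len(pre[0])
    scale = math.sqrt(dim)
    outputs, attn = [], []
    for i in range(n):
        # Window positions, clipped at the sequence borders.
        idx = [i + k * dilation for k in range(-radius, radius + 1)]
        idx = [j for j in idx if 0 <= j < n]
        # Assumed dissimilarity score: minus the scaled dot product.
        scores = [
            -sum(q * k for q, k in zip(post[i], pre[j])) / scale for j in idx
        ]
        # Numerically stable softmax over the window only.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the pre-change features inside the window.
        mixed = [
            sum(w * pre[j][d] for w, j in zip(weights, idx)) for d in range(dim)
        ]
        outputs.append(mixed)
        attn.append(dict(zip(idx, weights)))
    return outputs, attn


# Toy bi-temporal features: position 1 differs between the two dates.
pre_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
post_tokens = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
outputs, attn = windowed_dissimilarity_cross_attention(
    pre_tokens, post_tokens, radius=1
)
for i, weights in enumerate(attn):
    print(i, weights)
```

In this sketch each query is compared with at most 2 * radius + 1 keys, so the work grows with the sequence length times the window size instead of with the square of the sequence length, which is the cost reduction the description attributes to the sliding window.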