Weakly Supervised Semantic Segmentation of Remote Sensing Images Using Siamese Affinity Network

Bibliographic Details
Main Authors: Zheng Chen, Yuheng Lian, Jing Bai, Jingsen Zhang, Zhu Xiao, Biao Hou
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/17/5/808
Description
Summary: In recent years, weakly supervised semantic segmentation (WSSS) has garnered significant attention in remote sensing image analysis due to its low annotation cost. To address the issues of inaccurate and incomplete seed areas and unreliable pseudo masks in WSSS, we propose a novel WSSS method for remote sensing images based on the Siamese Affinity Network (SAN) and the Segment Anything Model (SAM). First, we design a seed enhancement module for semantic affinity, which strengthens contextual relevance in the feature map by enforcing a unified constraint principle of cross-pixel similarity, thereby capturing semantically similar regions within the image. Second, leveraging the prior notion of cross-view consistency, we employ a Siamese network to regularize the consistency of CAMs from different affine-transformed images, providing additional supervision for weakly supervised learning. Finally, we utilize the SAM segmentation model to generate semantic superpixels, expanding the original CAM seeds to more completely and accurately extract target edges, thereby improving the quality of segmentation pseudo masks. Experimental results on the large-scale remote sensing datasets DRLSD and ISPRS Vaihingen demonstrate that our method achieves segmentation performance close to that of fully supervised semantic segmentation (FSSS) methods on both datasets. Ablation studies further verify the positive optimization effect of each module on segmentation pseudo labels. Our approach exhibits superior localization accuracy and precise visualization effects across different backbone networks, achieving state-of-the-art localization performance.
ISSN: 2072-4292
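
Illustrative note: the summary describes expanding CAM seeds with SAM-generated semantic superpixels but does not state the expansion rule. The sketch below shows one plausible reading, a per-segment majority vote, in plain Python. The function name `expand_seeds`, the `IGNORE` marker, and the `min_agreement` threshold are hypothetical and not taken from the article; the segment ids stand in for whatever a mask generator such as SAM would return.

```python
from collections import Counter, defaultdict

IGNORE = -1  # assumed marker for pixels where the CAM gave no confident seed


def expand_seeds(seeds, segments, min_agreement=0.6):
    """Propagate sparse seed labels to whole segments by majority vote.

    seeds:    2D list of class ids, IGNORE where no seed is available
    segments: 2D list of segment (superpixel) ids with the same shape
    A segment takes its most frequent seed label only if that label
    accounts for at least `min_agreement` of the seeded pixels in it.
    """
    # Count seed labels inside each segment, skipping unlabeled pixels.
    votes = defaultdict(Counter)
    for seed_row, seg_row in zip(seeds, segments):
        for label, seg in zip(seed_row, seg_row):
            if label != IGNORE:
                votes[seg][label] += 1

    # Keep a winning label only for segments whose seeds mostly agree.
    winner = {}
    for seg, counter in votes.items():
        label, count = counter.most_common(1)[0]
        if count / sum(counter.values()) >= min_agreement:
            winner[seg] = label

    # Segments without a winner keep their original per-pixel seeds.
    return [
        [winner.get(seg, label) for label, seg in zip(seed_row, seg_row)]
        for seed_row, seg_row in zip(seeds, segments)
    ]


seeds = [[0, IGNORE, IGNORE, 1],
         [IGNORE, IGNORE, IGNORE, IGNORE]]
segments = [[0, 0, 1, 1],
            [0, 0, 1, 2]]
expanded = expand_seeds(seeds, segments)
```

In this toy input, segments 0 and 1 each contain a single seed and are filled with that seed's class, while segment 2 contains no seed and stays unlabeled. The agreement threshold is a design choice of this sketch: it leaves a segment untouched when its seeds conflict, rather than letting a narrow majority overwrite them.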