Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation

Image-level weakly supervised semantic segmentation is a challenging problem in computer vision and has gained a lot of attention in recent years. Most existing models utilize class activation mapping (CAM) to generate initial pseudo-labels for each image pixel. However, CAM usually focuses only on...

Bibliographic Details
Main Authors: Guanglun Huang, Zhaohao Zheng, Jun Li, Minghe Zhang, Jianming Liu, Li Zhang
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Applied Sciences
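Subjects: weakly supervised semantic segmentation; CAM; channel attention; spatial attention; pixel correlation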
Online Access: https://www.mdpi.com/2076-3417/15/12/6474
_version_ 1850156993214414848
author Guanglun Huang
Zhaohao Zheng
Jun Li
Minghe Zhang
Jianming Liu
Li Zhang
author_facet Guanglun Huang
Zhaohao Zheng
Jun Li
Minghe Zhang
Jianming Liu
Li Zhang
author_sort Guanglun Huang
collection DOAJ
description Image-level weakly supervised semantic segmentation is a challenging problem in computer vision and has attracted considerable attention in recent years. Most existing models use class activation mapping (CAM) to generate initial pseudo-labels for each image pixel. However, CAM usually focuses only on the most discriminative regions of target objects and treats each channel feature map independently, so it may overlook some important regions due to the lack of accurate pixel-level labels, leading to underactivation of the target objects. In this paper, we propose a dual attention equivariant network (DAEN) model to address this problem by considering both the channel and spatial information of different feature maps. Specifically, we first design a channel–spatial attention module (CSM) for DAEN that accurately extracts features of target objects by considering the correlation among feature maps in different channels, and then integrate the CSM with equivariant regularization and pixel-correlation modules to achieve more accurate and effective pixel-level semantic segmentation. Extensive experimental results show that the DAEN model achieved 2.1% and 1.3% higher mIoU scores than existing weakly supervised semantic segmentation models on the PASCAL VOC 2012 and LUAD-HistoSeg datasets, respectively, validating the effectiveness and efficiency of the DAEN model.
format Article
id doaj-art-8fd0190e0840418f9a8e2e3b1ade6891
institution OA Journals
issn 2076-3417
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-8fd0190e0840418f9a8e2e3b1ade6891
2025-08-20T02:24:18Z
eng
MDPI AG
Applied Sciences
2076-3417
2025-06-01
vol. 15, no. 12, art. 6474
10.3390/app15126474
Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
Guanglun Huang [0]
Zhaohao Zheng [1]
Jun Li [2]
Minghe Zhang [3]
Jianming Liu [4]
Li Zhang [5]
[0] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
[1] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
[2] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
[3] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
[4] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
[5] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Image-level weakly supervised semantic segmentation is a challenging problem in computer vision and has attracted considerable attention in recent years. Most existing models use class activation mapping (CAM) to generate initial pseudo-labels for each image pixel. However, CAM usually focuses only on the most discriminative regions of target objects and treats each channel feature map independently, so it may overlook some important regions due to the lack of accurate pixel-level labels, leading to underactivation of the target objects. In this paper, we propose a dual attention equivariant network (DAEN) model to address this problem by considering both the channel and spatial information of different feature maps. Specifically, we first design a channel–spatial attention module (CSM) for DAEN that accurately extracts features of target objects by considering the correlation among feature maps in different channels, and then integrate the CSM with equivariant regularization and pixel-correlation modules to achieve more accurate and effective pixel-level semantic segmentation. Extensive experimental results show that the DAEN model achieved 2.1% and 1.3% higher mIoU scores than existing weakly supervised semantic segmentation models on the PASCAL VOC 2012 and LUAD-HistoSeg datasets, respectively, validating the effectiveness and efficiency of the DAEN model.
https://www.mdpi.com/2076-3417/15/12/6474
weakly supervised semantic segmentation
CAM
channel attention
spatial attention
pixel correlation
spellingShingle Guanglun Huang
Zhaohao Zheng
Jun Li
Minghe Zhang
Jianming Liu
Li Zhang
Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
Applied Sciences
weakly supervised semantic segmentation
CAM
channel attention
spatial attention
pixel correlation
title Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
title_full Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
title_fullStr Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
title_full_unstemmed Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
title_short Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
title_sort dual attention equivariant network for weakly supervised semantic segmentation
topic weakly supervised semantic segmentation
CAM
channel attention
spatial attention
pixel correlation
url https://www.mdpi.com/2076-3417/15/12/6474
work_keys_str_mv AT guanglunhuang dualattentionequivariantnetworkforweaklysupervisedsemanticsegmentation
AT zhaohaozheng dualattentionequivariantnetworkforweaklysupervisedsemanticsegmentation
AT junli dualattentionequivariantnetworkforweaklysupervisedsemanticsegmentation
AT minghezhang dualattentionequivariantnetworkforweaklysupervisedsemanticsegmentation
AT jianmingliu dualattentionequivariantnetworkforweaklysupervisedsemanticsegmentation
AT lizhang dualattentionequivariantnetworkforweaklysupervisedsemanticsegmentation