Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation
Image-level weakly supervised semantic segmentation is a challenging problem in computer vision and has gained a lot of attention in recent years. Most existing models utilize class activation mapping (CAM) to generate initial pseudo-labels for each image pixel. However, CAM usually focuses only on...
| Main Authors: | Guanglun Huang, Zhaohao Zheng, Jun Li, Minghe Zhang, Jianming Liu, Li Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Applied Sciences |
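| Subjects: | weakly supervised semantic segmentation; CAM; channel attention; spatial attention; pixel correlation |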
| Online Access: | https://www.mdpi.com/2076-3417/15/12/6474 |
| _version_ | 1850156993214414848 |
|---|---|
| author | Guanglun Huang, Zhaohao Zheng, Jun Li, Minghe Zhang, Jianming Liu, Li Zhang |
| author_sort | Guanglun Huang |
| collection | DOAJ |
| description | Image-level weakly supervised semantic segmentation is a challenging problem in computer vision that has attracted considerable attention in recent years. Most existing models use class activation mapping (CAM) to generate initial pseudo-labels for each image pixel. However, CAM usually focuses only on the most discriminative regions of target objects and treats each channel feature map independently. Without accurate pixel-level labels, it may therefore overlook important regions, leading to underactivation of the target objects. In this paper, we propose a dual attention equivariant network (DAEN) model that addresses this problem by considering both the channel and spatial information of different feature maps. Specifically, we first design a channel–spatial attention module (CSM) for DAEN that extracts features of target objects more accurately by considering the correlation among feature maps in different channels. We then integrate the CSM with equivariant regularization and pixel-correlation modules to achieve more accurate and effective pixel-level semantic segmentation. Extensive experimental results show that the DAEN model achieved 2.1% and 1.3% higher mIoU scores than existing weakly supervised semantic segmentation models on the PASCAL VOC 2012 and LUAD-HistoSeg datasets, respectively, validating the effectiveness and efficiency of the DAEN model. |
| format | Article |
| id | doaj-art-8fd0190e0840418f9a8e2e3b1ade6891 |
| institution | OA Journals |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-8fd0190e0840418f9a8e2e3b1ade6891; 2025-08-20T02:24:18Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-06-01; vol. 15, no. 12, article 6474; DOI 10.3390/app15126474; Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation; Guanglun Huang, Zhaohao Zheng, Jun Li, Minghe Zhang, Jianming Liu, Li Zhang (all: School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China); https://www.mdpi.com/2076-3417/15/12/6474; keywords: weakly supervised semantic segmentation, CAM, channel attention, spatial attention, pixel correlation |
| title | Dual Attention Equivariant Network for Weakly Supervised Semantic Segmentation |
| title_sort | dual attention equivariant network for weakly supervised semantic segmentation |
| topic | weakly supervised semantic segmentation; CAM; channel attention; spatial attention; pixel correlation |
| url | https://www.mdpi.com/2076-3417/15/12/6474 |