SP-IGAN: An Improved GAN Framework for Effective Utilization of Semantic Priors in Real-World Image Super-Resolution
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Entropy |
| Subjects: | |
| Online Access: | https://www.mdpi.com/1099-4300/27/4/414 |
| Summary: | Single-image super-resolution (SISR) based on GANs has achieved significant progress. However, these methods still struggle to reconstruct locally consistent textures because they lack a semantic understanding of image categories. Model design therefore needs to address both the comprehension of contextual information and the acquisition of high-frequency details. To address this issue, we propose the Semantic Prior-Improved GAN (SP-IGAN) framework, which incorporates additional contextual semantic information into the Real-ESRGAN model. The framework consists of two branches. The main branch introduces a Graph Convolutional Channel Attention (GCCA) module that transforms channel dependencies into adjacency relationships between feature vertices, thereby enhancing pixel associations. The auxiliary branch strengthens the correlation between semantic category information and regional textures in the Residual-in-Residual Dense Block (RRDB) module: it employs a pretrained segmentation model to accurately extract regional semantic information from the input low-resolution image and injects this information into the RRDB module through Spatial Feature Transform (SFT) layers, generating more accurate and semantically consistent texture details. Additionally, a wavelet loss is incorporated into the loss function to capture high-frequency details that are often overlooked. The experimental results demonstrate that the proposed SP-IGAN outperforms state-of-the-art (SOTA) super-resolution models across multiple public datasets. For the ×4 super-resolution task, SP-IGAN achieves a 0.55 dB improvement in Peak Signal-to-Noise Ratio (PSNR) and a 0.0363 increase in Structural Similarity Index (SSIM) compared to the baseline model Real-ESRGAN. |
| ISSN: | 1099-4300 |
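
To put the reported 0.55 dB PSNR gain in context, the following is a minimal sketch of the standard PSNR definition, PSNR = 10 · log10(MAX² / MSE), together with the mean-squared-error ratio that a given dB gain implies. It is illustrative only: the function names (`mse`, `psnr`, `mse_ratio_for_gain`), the 8-bit peak value of 255, and the tiny test image are assumptions, and the record does not state the paper's exact evaluation protocol (for example, which color channel is used or how borders are cropped).

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized grayscale images.

    Images are nested lists of pixel rows (a hypothetical, dependency-free
    representation chosen for this sketch).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("rows must have the same length")
        for r, e in zip(ref_row, est_row):
            total += (r - e) ** 2
            count += 1
    return total / count


def psnr(reference, estimate, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Ratio MSE_new / MSE_old implied by a PSNR gain at a fixed peak value."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # Made-up 2x2 images, used only to exercise the functions.
    reference = [[0, 255], [255, 0]]
    estimate = [[0, 250], [255, 5]]
    print(f"PSNR: {psnr(reference, estimate):.2f} dB")
    print(f"MSE ratio for +0.55 dB: {mse_ratio_for_gain(0.55):.3f}")
```

For a single image, a 0.55 dB gain corresponds to an MSE ratio of about 0.881, that is, roughly a 12% reduction in mean squared error. Because benchmark PSNR is usually averaged over many images, this reading is only approximate for a dataset-level figure.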