SP-IGAN: An Improved GAN Framework for Effective Utilization of Semantic Priors in Real-World Image Super-Resolution

Bibliographic Details
Main Authors: Meng Wang, Zhengnan Li, Haipeng Liu, Zhaoyu Chen, Kewei Cai
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/27/4/414
Description
Summary: Single-image super-resolution (SISR) based on GANs has achieved significant progress. However, these methods still struggle to reconstruct locally consistent textures because they lack a semantic understanding of image categories. This highlights the need for model designs that focus on comprehending contextual information and acquiring high-frequency details. To address this issue, we propose the Semantic Prior-Improved GAN (SP-IGAN) framework, which incorporates additional contextual semantic information into the Real-ESRGAN model. The framework consists of two branches. The main branch introduces a Graph Convolutional Channel Attention (GCCA) module that transforms channel dependencies into adjacency relationships between feature vertices, thereby enhancing pixel associations. The auxiliary branch strengthens the correlation between semantic category information and regional textures in the Residual-in-Residual Dense Block (RRDB) module. It employs a pretrained segmentation model to extract regional semantic information from the input low-resolution image. This information is injected into the RRDB module through Spatial Feature Transform (SFT) layers, generating more accurate and semantically consistent texture details. Additionally, a wavelet loss is incorporated into the loss function to capture high-frequency details that are often overlooked. The experimental results demonstrate that the proposed SP-IGAN outperforms state-of-the-art (SOTA) super-resolution models across multiple public datasets. For the ×4 super-resolution task, SP-IGAN achieves a 0.55 dB improvement in Peak Signal-to-Noise Ratio (PSNR) and a 0.0363 increase in Structural Similarity Index (SSIM) compared to the baseline model Real-ESRGAN.
ISSN: 1099-4300
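
The summary reports its headline result in PSNR, so a short illustration of that metric may help in reading the 0.55 dB figure. The sketch below uses the standard definition PSNR = 10 * log10(MAX^2 / MSE); it is not taken from the paper's code. The function names, the tiny 3x3 grayscale patches, and the 8-bit peak value of 255 are assumptions made only for illustration, and the record does not say which colour channel or crop the authors evaluate on.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized grayscale images,
    each given as a list of rows of pixel values."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def psnr(a, b, max_value=255.0):
    """Standard PSNR in dB; max_value=255 assumes 8-bit images."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)


# Hypothetical 3x3 patches, not data from the paper.
reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
restored = [[54, 55, 60], [57, 80, 63], [75, 62, 58]]

print(psnr(reference, restored))
print(round(mse_ratio_for_gain(0.55), 3))  # -> 0.881
```

Under this definition, a 0.55 dB gain corresponds to the mean squared error falling to about 0.881 of its previous value, that is, roughly a 12% reduction, provided both models are scored against the same reference images.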