A Style Transfer Method for Chinese Landscape Painting Based on Detail Feature Extraction and Fusion

Bibliographic Details
Main Authors: Jinghao HU, Guohua GENG, Meijun XIONG, Siyi LI, Yuhe ZHANG
Format: Article
Language: English
Published: Editorial Department of Journal of Sichuan University (Engineering Science Edition) 2025-01-01
Series: Advanced Engineering Sciences (工程科学与技术)
Subjects:
Online Access:http://jsuese.scu.edu.cn/thesisDetails#10.12454/j.jsuese.202300295
Summary: The goal of Chinese painting image style transfer is to render a real landscape scene image with Chinese painting artistic features, guided by a style reference, while preserving the content of the original, realistic scene image. With the rapid development of deep learning, convolutional neural networks (CNNs) and generative adversarial networks (GANs) have come to dominate image generation tasks, including style transfer. However, several hard-to-control problems persist, such as the loss of some semantics during style transfer, mode collapse in GAN training, and the checkerboard effect in CNN-based style transfer methods. The vision transformer offers a new solution for image processing tasks, but it requires a large amount of training data and incurs significant computational complexity. To address these issues and generate high-quality Chinese paintings, a Chinese landscape painting style transfer network, SSTR (Swin style transfer transformer), is proposed based on the extraction and fusion of detail features. The approach introduces the Swin Transformer within the StyTr<sup>2</sup> network framework and uses the vision transformer to preserve the features of landscapes. In addition, the hierarchical architecture of the Swin Transformer and its shifted-window attention mechanism are exploited to extract finer details of the artistic features of landscape paintings while reducing the model's training complexity. Finally, a CNN decoder is appended after the Swin Transformer decoder to refine the resulting image. The public visual dataset COCO and a public landscape painting dataset are used for training, validation, and testing, and the results are compared against several baseline methods. The experimental results demonstrate that SSTR outperforms StyTr<sup>2</sup> in style loss on the Chinese landscape painting style transfer task, showing superior feature extraction capability and image generation performance.
ISSN:2096-3246