Prompt Conditioned Batik Pattern Generation Using LoRA Weighted Diffusion Model With Classifier-Free Guidance

Bibliographic Details
Main Authors: Rahmatulloh Daffa Izzuddin Wahid, Novanto Yudistira, Candra Dewi, Irawati Nurmala Sari, Dyanningrum Pradhikta, Fatmawati
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Batik, diffusion model, image generation, caption generation
Online Access: https://ieeexplore.ieee.org/document/10817602/
_version_ 1841556985426739200
author Rahmatulloh Daffa Izzuddin Wahid
Novanto Yudistira
Candra Dewi
Irawati Nurmala Sari
Dyanningrum Pradhikta
Fatmawati
collection DOAJ
description Batik, a significant element of Indonesian cultural heritage, is renowned for its intricate patterns and profound philosophical meanings. While preserving traditional batik is crucial, the creation of modern patterns is equally encouraged to keep the art form vibrant and evolving. Current research primarily focuses on batik classification, leaving a gap in the exploration of generative models for batik pattern creation. This paper investigates the application of text-to-image (T2I) generative models to synthesize batik motifs, leveraging latent diffusion models (LDM), Low-Rank Adaptation (LoRA), and classifier-free guidance. Our methodology employed a dataset of 20,000 batik images. The multimodal models LLaVA and BLIP were used to generate detailed captions for these images. The denoising U-Net of a pretrained LDM was then fine-tuned, either by naively fine-tuning the entire U-Net or by using LoRA. The fine-tuning process was critical in enhancing the model’s capability to generate high-quality and user-specific batik patterns. The results demonstrated that the LDM fine-tuned on the entire denoising U-Net with LLaVA-captioned images outperformed the other models, achieving the lowest Fréchet Inception Distance (FID) and the highest Inception Score (IS). The more thorough LLaVA captions proved superior to those generated by BLIP, emphasizing the significance of detailed image descriptions in generative tasks. Notably, the model not only replicated existing batik patterns but also innovatively combined multiple motifs and was even able to create entirely new designs, as verified by a batik expert. This research contributes to the field of computer-assisted batik pattern generation, providing significant advantages for batik artists, manufacturers, and users by accelerating the pattern creation process and expanding the possibilities of batik art.
format Article
id doaj-art-0464496d096d4f999ab7c44190bf60aa
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-0464496d096d4f999ab7c44190bf60aa
2025-01-07T00:02:35Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
2436
2448
10.1109/ACCESS.2024.3523494
10817602
Prompt Conditioned Batik Pattern Generation Using LoRA Weighted Diffusion Model With Classifier-Free Guidance
Rahmatulloh Daffa Izzuddin Wahid 0 https://orcid.org/0009-0004-2572-8601
Novanto Yudistira 1 https://orcid.org/0000-0001-5330-5930
Candra Dewi 2 https://orcid.org/0000-0003-0739-4148
Irawati Nurmala Sari 3
Dyanningrum Pradhikta 4
Fatmawati 5
Program Studi Teknik Informatika, Departemen Teknik Informatika, Fakultas Ilmu Komputer, Universitas Brawijaya, Malang, East Java, Indonesia
Program Studi Teknik Informatika, Departemen Teknik Informatika, Fakultas Ilmu Komputer, Universitas Brawijaya, Malang, East Java, Indonesia
Program Studi Teknik Informatika, Departemen Teknik Informatika, Fakultas Ilmu Komputer, Universitas Brawijaya, Malang, East Java, Indonesia
Program Studi Teknik Informatika, Departemen Teknik Informatika, Fakultas Ilmu Komputer, Universitas Brawijaya, Malang, East Java, Indonesia
Program Studi Seni Rupa Murni, Jurusan Seni dan Antropologi Budaya, Fakultas Ilmu Budaya, Universitas Brawijaya, Malang, East Java, Indonesia
Program Studi Seni Rupa Murni, Jurusan Seni dan Antropologi Budaya, Fakultas Ilmu Budaya, Universitas Brawijaya, Malang, East Java, Indonesia
https://ieeexplore.ieee.org/document/10817602/
Batik
diffusion model
image generation
caption generation
title Prompt Conditioned Batik Pattern Generation Using LoRA Weighted Diffusion Model With Classifier-Free Guidance
topic Batik
diffusion model
image generation
caption generation
url https://ieeexplore.ieee.org/document/10817602/
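
Illustrative note on the methods named in the description: the abstract mentions classifier-free guidance and LoRA without giving formulas. The sketch below shows the standard textbook form of both, in plain Python on small lists. It is not the authors' implementation. The function names, the guidance scale of 7.5, and the toy matrices are hypothetical choices made only for illustration; the record does not state the values used in the paper.

```python
# Illustrative sketch only; not the authors' code.
# All names and numeric values here are hypothetical.
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def cfg_combine(eps_uncond: Vector, eps_cond: Vector, scale: float) -> Vector:
    """Classifier-free guidance on two noise predictions.

    Returns eps_uncond + scale * (eps_cond - eps_uncond).
    scale = 0 gives the unconditional prediction, scale = 1 the
    prompt-conditioned one, and scale > 1 pushes further toward the prompt.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """LoRA update: W' = W + (alpha / r) * B @ A.

    W stays frozen; only B (d_out x r) and A (r x d_in) are trained,
    where r is the (small) rank, read here from the number of rows of A.
    """
    rank = len(a)
    delta = matmul(b, a)
    s = alpha / rank
    return [
        [w_ij + s * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    # Toy noise predictions for a 2-element latent.
    print(cfg_combine([0.0, 1.0], [1.0, 3.0], scale=7.5))

    # Toy 2x2 frozen weight with a rank-1 LoRA update.
    w = [[1.0, 0.0], [0.0, 1.0]]
    b = [[1.0], [2.0]]
    a = [[3.0, 4.0]]
    print(lora_effective_weight(w, b, a, alpha=2.0))
```

In this toy case the LoRA factors hold 4 numbers in total, the same as the 2x2 weight they adjust; the saving only appears at realistic sizes, where the rank is much smaller than the layer dimensions. Full fine-tuning, the other variant the abstract compares, would instead update every entry of W directly.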