Prompt Conditioned Batik Pattern Generation Using LoRA Weighted Diffusion Model With Classifier-Free Guidance
Batik, a significant element of Indonesian cultural heritage, is renowned for its intricate patterns and profound philosophical meanings. While preserving traditional batik is crucial, the creation of modern patterns is equally encouraged to keep the art form vibrant and evolving. Current research p...
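Main Authors: | Rahmatulloh Daffa Izzuddin Wahid, Novanto Yudistira, Candra Dewi, Irawati Nurmala Sari, Dyanningrum Pradhikta, Fatmawati |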
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Batik; diffusion model; image generation; caption generation |
Online Access: | https://ieeexplore.ieee.org/document/10817602/ |
_version_ | 1841556985426739200 |
---|---|
author | Rahmatulloh Daffa Izzuddin Wahid; Novanto Yudistira; Candra Dewi; Irawati Nurmala Sari; Dyanningrum Pradhikta; Fatmawati |
author_sort | Rahmatulloh Daffa Izzuddin Wahid |
collection | DOAJ |
description | Batik, a significant element of Indonesian cultural heritage, is renowned for its intricate patterns and profound philosophical meanings. While preserving traditional batik is crucial, the creation of modern patterns is equally encouraged to keep the art form vibrant and evolving. Current research primarily focuses on batik classification, leaving a gap in the exploration of generative models for batik pattern creation. This paper investigates the application of text-to-image (T2I) generative models to synthesize batik motifs, leveraging latent diffusion models (LDM), Low-Rank Adaptation (LoRA), and classifier-free guidance. Our methodology employed a dataset of 20,000 batik images. Multimodal models such as LLaVA and BLIP were used to generate detailed captions for these images. The denoising U-Net of a pretrained LDM was then fine-tuned, either by naively fine-tuning the entire U-Net or by using LoRA. The fine-tuning process was critical in enhancing the model’s capability to generate high-quality and user-specific batik patterns. The results demonstrated that the LDM fine-tuned on the entire denoising U-Net with LLaVA-captioned images outperformed the other models, achieving the lowest Fréchet Inception Distance (FID) and the highest Inception Score (IS). The thoroughness of LLaVA captions proved superior to that of the captions generated by BLIP, emphasizing the significance of detailed image descriptions in generative tasks. Notably, the model not only replicated existing batik patterns but also innovatively combined multiple motifs and was even able to create entirely new designs, as verified by a batik expert. This research contributes to the field of computer-assisted batik pattern generation, providing significant advantages for batik artists, manufacturers, and users by accelerating the pattern creation process and expanding the possibilities of batik art. |
format | Article |
id | doaj-art-0464496d096d4f999ab7c44190bf60aa |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-0464496d096d4f999ab7c44190bf60aa; 2025-01-07T00:02:35Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; volume 13, pages 2436-2448; DOI 10.1109/ACCESS.2024.3523494; document 10817602; Prompt Conditioned Batik Pattern Generation Using LoRA Weighted Diffusion Model With Classifier-Free Guidance; Rahmatulloh Daffa Izzuddin Wahid (https://orcid.org/0009-0004-2572-8601), Novanto Yudistira (https://orcid.org/0000-0001-5330-5930), Candra Dewi (https://orcid.org/0000-0003-0739-4148), Irawati Nurmala Sari, Dyanningrum Pradhikta, Fatmawati; affiliations in author order: first four authors, Program Studi Teknik Informatika, Departemen Teknik Informatika, Fakultas Ilmu Komputer, Universitas Brawijaya, Malang, East Java, Indonesia; last two authors, Program Studi Seni Rupa Murni, Jurusan Seni dan Antropologi Budaya, Fakultas Ilmu Budaya, Universitas Brawijaya, Malang, East Java, Indonesia; https://ieeexplore.ieee.org/document/10817602/; keywords: Batik, diffusion model, image generation, caption generation |
title | Prompt Conditioned Batik Pattern Generation Using LoRA Weighted Diffusion Model With Classifier-Free Guidance |
title_sort | prompt conditioned batik pattern generation using lora weighted diffusion model with classifier free guidance |
topic | Batik; diffusion model; image generation; caption generation |
url | https://ieeexplore.ieee.org/document/10817602/ |