LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education


Bibliographic Details
Main Authors: Unggi Lee, Minji Jeon, Yunseo Lee, Gyuri Byun, Yoorim Son, Jaeyoon Shin, Hongkyu Ko, Hyeoncheol Kim
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: Computers and Education: Artificial Intelligence
Subjects: Art appreciation education; Multimodal large language model; Instruction tuning
Online Access: http://www.sciencedirect.com/science/article/pii/S2666920X24001000
_version_ 1850250287034400768
author Unggi Lee
Minji Jeon
Yunseo Lee
Gyuri Byun
Yoorim Son
Jaeyoon Shin
Hongkyu Ko
Hyeoncheol Kim
author_facet Unggi Lee
Minji Jeon
Yunseo Lee
Gyuri Byun
Yoorim Son
Jaeyoon Shin
Hongkyu Ko
Hyeoncheol Kim
author_sort Unggi Lee
collection DOAJ
description Although AI systems have been developed to support learning in many domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, which most students perceive as unfamiliar and challenging, can become more accessible with a generative-AI-enabled conversation partner that asks tailored questions and encourages the audience to appreciate artwork deeply. This study explores the application of multimodal large language models (MLLMs) in art appreciation education, with a focus on developing LLaVA-Docent, a model designed to serve as a personal tutor for art appreciation. Our approach followed design and development research, iteratively refining the application to produce a functional MLLM-enabled chatbot along with a data design framework for art appreciation education. To that end, we built a virtual dialogue dataset generated by GPT-4, which was instrumental in training our MLLM, LLaVA-Docent. We evaluated LLaVA-Docent by benchmarking it against alternative settings, which revealed its distinct strengths and weaknesses. Our findings highlight the efficacy of the MLLM-based personalized art appreciation chatbot and demonstrate its applicability to a novel approach to teaching and experiencing art appreciation.
format Article
id doaj-art-22018e8d9dae4ac5a8380dcce5f3586c
institution OA Journals
issn 2666-920X
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series Computers and Education: Artificial Intelligence
spelling doaj-art-22018e8d9dae4ac5a8380dcce5f3586c
2025-08-20T01:58:16Z
eng
Elsevier
Computers and Education: Artificial Intelligence
2666-920X
2024-12-01
7
100297
10.1016/j.caeai.2024.100297
LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
Unggi Lee (0), Minji Jeon (1), Yunseo Lee (2), Gyuri Byun (3), Yoorim Son (4), Jaeyoon Shin (5), Hongkyu Ko (6), Hyeoncheol Kim (7)
(0) Department of Computer Science and Engineering, Korea University, South Korea; Corresponding author. Department of Computer Science & Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, South Korea.
(1) Teaching, Learning and Teacher Education, University of Nebraska-Lincoln, United States
(2) Poongnap Elementary School, Seoul Metropolitan Office of Education, South Korea
(3) Department of Education, Seoul National University, South Korea
(4) Interdisciplinary Program in Art Education (Art Education Major), Seoul National University, South Korea
(5) Department of Elementary Art Education, Seoul National University of Education, South Korea
(6) Department of Elementary Art Education, Seoul National University of Education, South Korea; Corresponding author.
(7) Department of Computer Science and Engineering, Korea University, South Korea; Corresponding author.
http://www.sciencedirect.com/science/article/pii/S2666920X24001000
Art appreciation education
Multimodal large language model
Instruction tuning
spellingShingle Unggi Lee
Minji Jeon
Yunseo Lee
Gyuri Byun
Yoorim Son
Jaeyoon Shin
Hongkyu Ko
Hyeoncheol Kim
LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
Computers and Education: Artificial Intelligence
Art appreciation education
Multimodal large language model
Instruction tuning
title LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
title_full LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
title_fullStr LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
title_full_unstemmed LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
title_short LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
title_sort llava docent instruction tuning with multimodal large language model to support art appreciation education
topic Art appreciation education
Multimodal large language model
Instruction tuning
url http://www.sciencedirect.com/science/article/pii/S2666920X24001000
work_keys_str_mv AT unggilee llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation
AT minjijeon llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation
AT yunseolee llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation
AT gyuribyun llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation
AT yoorimson llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation
AT jaeyoonshin llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation
AT hongkyuko llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation
AT hyeoncheolkim llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation