LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education
Despite the development of various AI systems to support learning in various domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, often perceived as an unfamiliar and challenging endeavor for most students, can be more accessible with a generativ...
| Main Authors: | Unggi Lee, Minji Jeon, Yunseo Lee, Gyuri Byun, Yoorim Son, Jaeyoon Shin, Hongkyu Ko, Hyeoncheol Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-12-01 |
| Series: | Computers and Education: Artificial Intelligence |
| Subjects: | Art appreciation education; Multimodal large language model; Instruction tuning |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2666920X24001000 |
| _version_ | 1850250287034400768 |
|---|---|
| author | Unggi Lee Minji Jeon Yunseo Lee Gyuri Byun Yoorim Son Jaeyoon Shin Hongkyu Ko Hyeoncheol Kim |
| author_facet | Unggi Lee Minji Jeon Yunseo Lee Gyuri Byun Yoorim Son Jaeyoon Shin Hongkyu Ko Hyeoncheol Kim |
| author_sort | Unggi Lee |
| collection | DOAJ |
| description | Despite the development of various AI systems to support learning in various domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, often perceived as an unfamiliar and challenging endeavor for most students, can be more accessible with a generative-AI-enabled conversation partner that provides tailored questions and encourages the audience to deeply appreciate artwork. This study explores the application of multimodal large language models (MLLMs) in art appreciation education, with a focus on developing LLaVA-Docent, a model designed to serve as a personal tutor for art appreciation. Our approach involved design and development research, focusing on iterative enhancement to design and develop the application to produce a functional MLLM-enabled chatbot along with a data design framework for art appreciation education. To that end, we established a virtual dialogue dataset that was generated by GPT-4, which was instrumental in training our MLLM, LLaVA-Docent. The performance of LLaVA-Docent was evaluated by benchmarking it against alternative settings and revealed its distinct strengths and weaknesses. Our findings highlight the efficacy of the MLLM-based personalized art appreciation chatbot and demonstrate its applicability for a novel approach in which art appreciation is taught and experienced. |
| format | Article |
| id | doaj-art-22018e8d9dae4ac5a8380dcce5f3586c |
| institution | OA Journals |
| issn | 2666-920X |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Computers and Education: Artificial Intelligence |
| spelling | doaj-art-22018e8d9dae4ac5a8380dcce5f3586c; 2025-08-20T01:58:16Z; eng; Elsevier; Computers and Education: Artificial Intelligence; 2666-920X; 2024-12-01; 7; 100297; 10.1016/j.caeai.2024.100297; LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education; Unggi Lee (Department of Computer Science and Engineering, Korea University, South Korea; corresponding author; Department of Computer Science & Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, South Korea); Minji Jeon (Teaching, Learning and Teacher Education, University of Nebraska-Lincoln, United States); Yunseo Lee (Poongnap Elementary School, Seoul Metropolitan Office of Education, South Korea); Gyuri Byun (Department of Education, Seoul National University, South Korea); Yoorim Son (Interdisciplinary Program in Art Education (Art Education Major), Seoul National University, South Korea); Jaeyoon Shin (Department of Elementary Art Education, Seoul National University of Education, South Korea); Hongkyu Ko (Department of Elementary Art Education, Seoul National University of Education, South Korea; corresponding author); Hyeoncheol Kim (Department of Computer Science and Engineering, Korea University, South Korea; corresponding author); http://www.sciencedirect.com/science/article/pii/S2666920X24001000; Art appreciation education; Multimodal large language model; Instruction tuning |
| spellingShingle | Unggi Lee Minji Jeon Yunseo Lee Gyuri Byun Yoorim Son Jaeyoon Shin Hongkyu Ko Hyeoncheol Kim LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education Computers and Education: Artificial Intelligence Art appreciation education Multimodal large language model Instruction tuning |
| title | LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education |
| title_full | LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education |
| title_fullStr | LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education |
| title_full_unstemmed | LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education |
| title_short | LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education |
| title_sort | llava docent instruction tuning with multimodal large language model to support art appreciation education |
| topic | Art appreciation education Multimodal large language model Instruction tuning |
| url | http://www.sciencedirect.com/science/article/pii/S2666920X24001000 |
| work_keys_str_mv | AT unggilee llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation AT minjijeon llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation AT yunseolee llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation AT gyuribyun llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation AT yoorimson llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation AT jaeyoonshin llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation AT hongkyuko llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation AT hyeoncheolkim llavadocentinstructiontuningwithmultimodallargelanguagemodeltosupportartappreciationeducation |