LLM-Based Response Generation for Korean Adolescents: A Study Using the NAVER Knowledge iN Q&A Dataset with RAG

Bibliographic Details
Main Authors: Junseo Kim, Seok Jun Kim, Junseok Ahn, Suehyun Lee
Format: Article
Language: English
Published: The Korean Society of Medical Informatics 2025-04-01
Series: Healthcare Informatics Research
Online Access: http://e-hir.org/upload/pdf/hir-2025-31-2-136.pdf
Description
Summary:

Objectives: This research aimed to develop a retrieval-augmented generation (RAG) based large language model (LLM) system that offers personalized and reliable responses to a wide range of concerns raised by Korean adolescents. Our work focuses on building a culturally reflective dataset and on designing and validating the system’s effectiveness by comparing the answer quality of RAG-based models with non-RAG models.

Methods: Data were collected from the NAVER Knowledge iN platform, concentrating on posts that featured adolescents’ questions and corresponding expert responses during the period 2014–2024. The dataset comprises 3,874 cases, categorized by key negative emotions and the primary sources of worry. The data were processed to remove irrelevant or redundant content and then classified into general and detailed causes. The RAG-based model employed FAISS for similarity-based retrieval of the top three reference cases and used GPT-4o mini for response generation. The responses generated with and without RAG were evaluated using several metrics.

Results: RAG-based responses outperformed non-RAG responses across all evaluation metrics. Key findings indicate that RAG-based responses delivered more specific, empathetic, and actionable guidance, particularly when addressing complex emotional and situational concerns. The analysis revealed that family relationships, peer interactions, and academic stress are significant factors affecting adolescents’ worries, with depression and stress frequently co-occurring.

Conclusions: This study demonstrates the potential of RAG-based LLMs to address the diverse and culture-specific worries of Korean adolescents. By integrating external knowledge and offering personalized support, the proposed system provides a scalable approach to enhancing mental health interventions for adolescents. Future research should concentrate on expanding the dataset and improving multi-turn conversational capabilities to deliver even more comprehensive support.
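
The retrieval step described in the Methods (similarity search for the top three reference cases, followed by prompt construction for the generator) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a toy bag-of-words embedding and a brute-force cosine search as stand-ins for the embedding model and the FAISS index, which the abstract does not specify in detail. The function names (`embed`, `retrieve_top_k`, `build_prompt`), the prompt wording, and the example cases are all hypothetical and invented for illustration. The call to GPT-4o mini is omitted.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a real system would use a
    sentence-embedding model and store the vectors in a FAISS index)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(query, cases, k=3):
    """Return the k cases whose questions are most similar to the query.
    Brute-force search stands in for FAISS nearest-neighbour retrieval."""
    q_vec = embed(query)
    ranked = sorted(
        cases,
        key=lambda case: cosine(q_vec, embed(case["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, references):
    """Assemble a generation prompt from the retrieved reference cases.
    The wording is hypothetical, not the prompt used in the study."""
    lines = ["Reference cases:"]
    for i, ref in enumerate(references, start=1):
        lines.append(f"{i}. Q: {ref['question']}")
        lines.append(f"   A: {ref['answer']}")
    lines.append("")
    lines.append(f"Adolescent's question: {query}")
    lines.append("Write an empathetic, specific, and actionable reply.")
    return "\n".join(lines)


# Hypothetical cases invented for illustration; not taken from the dataset.
CASES = [
    {"question": "my parents argue with me about my grades every day",
     "answer": "(expert answer 1)"},
    {"question": "my friends ignore me at school",
     "answer": "(expert answer 2)"},
    {"question": "I feel stressed about exams and grades",
     "answer": "(expert answer 3)"},
    {"question": "I cannot sleep at night",
     "answer": "(expert answer 4)"},
    {"question": "my parents compare me with my sibling",
     "answer": "(expert answer 5)"},
]

if __name__ == "__main__":
    question = "my parents are angry about my exams and grades"
    top_three = retrieve_top_k(question, CASES, k=3)
    print(build_prompt(question, top_three))
```

In a non-RAG baseline, the prompt would contain only the adolescent's question; the comparison reported in the Results is between responses generated with and without the retrieved reference cases.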
ISSN: 2093-3681, 2093-369X