TCMLCM: an intelligent question-answering model for traditional Chinese medicine lung cancer based on the KG2TRAG method

Objective: To improve the accuracy and professionalism of question-answering (QA) model in traditional Chinese medicine (TCM) lung cancer by integrating large language models with structured knowledge graphs using the knowledge graph (KG) to text-enhanced retrieval-augmented generation (KG2TRAG) met...

Bibliographic Details
Main Authors: Zhou Chunfang, Gong Qingyue, Zhan Wendong, Zhu Jinyang, Luan Huidan
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2025-03-01
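Series: Digital Chinese Medicine
Subjects: Traditional Chinese medicine (TCM); Lung cancer; Question-answering; Large language model; Fine-tuning; Knowledge graph
Online Access: http://www.sciencedirect.com/science/article/pii/S2589377725000291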
author Zhou Chunfang
Gong Qingyue
Zhan Wendong
Zhu Jinyang
Luan Huidan
collection DOAJ
description Objective: To improve the accuracy and professionalism of question-answering (QA) models for traditional Chinese medicine (TCM) lung cancer by integrating large language models (LLMs) with structured knowledge graphs (KGs) using the KG to text-enhanced retrieval-augmented generation (KG2TRAG) method. Methods: The TCM lung cancer model (TCMLCM) was constructed by fine-tuning ChatGLM2-6B on the specialized datasets Tianchi TCM, HuangDi, and ShenNong-TCM-Dataset, as well as a TCM lung cancer KG. The KG2TRAG method was applied to enhance knowledge retrieval: it converts KG triples into natural language text via ChatGPT-aided linearization, so that the LLM can use the retrieved text for context-aware reasoning. MedicalGPT, HuatuoGPT, and BenTsao were selected as baseline models for comparison. Performance was evaluated using bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), accuracy, and the domain-specific TCM-LCEval metrics, with TCM oncology experts additionally assessing answer accuracy, professionalism, and usability. Results: TCMLCM achieved the best performance across all metrics, including a BLEU score of 32.15%, a ROUGE-L score of 59.08%, and an accuracy rate of 79.68%. On the TCM-specific TCM-LCEval assessment, its performance was 3% to 12% higher than that of the baseline models. Expert evaluations highlighted superior performance in accuracy and professionalism. Conclusion: TCMLCM provides an innovative solution for TCM lung cancer QA and demonstrates the feasibility of integrating structured KGs with LLMs. This work advances intelligent TCM healthcare tools and lays a foundation for future AI-driven applications in traditional medicine.
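Illustrative note: the description above outlines a pipeline in which KG triples are turned into sentences, relevant sentences are retrieved for a question, and the retrieved text is given to the LLM as context. The following is only a minimal sketch of that general idea under stated assumptions, not the authors' implementation. It swaps in fixed string templates where the article uses ChatGPT-aided linearization, and word overlap where a real system would use a proper retriever. The triples, relation names, templates, and function names are all hypothetical placeholders.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Hypothetical templates. The article uses ChatGPT-aided linearization instead;
# fixed templates are a simpler stand-in for illustration only.
TEMPLATES = {
    "treats": "{head} is used to treat {tail}.",
    "has_symptom": "{head} commonly presents with {tail}.",
}


def linearize(triple: Triple) -> str:
    """Turn one KG triple into a natural-language sentence."""
    head, relation, tail = triple
    # Fall back to a generic "head relation tail." sentence for unknown relations.
    template = TEMPLATES.get(relation, "{head} {relation} {tail}.")
    return template.format(head=head, relation=relation.replace("_", " "), tail=tail)


def tokenize(text: str) -> set:
    """Lowercase and split on whitespace after dropping basic punctuation."""
    for mark in ".,?!":
        text = text.replace(mark, " ")
    return set(text.lower().split())


def retrieve(question: str, passages: List[str], k: int = 2) -> List[str]:
    """Return the k passages sharing the most words with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q_tokens & tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble a retrieval-augmented prompt to pass to a language model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    # Placeholder triples, not taken from the article's knowledge graph.
    triples = [
        ("Formula A", "treats", "Syndrome X"),
        ("Syndrome X", "has_symptom", "dry cough"),
        ("Herb B", "is_part_of", "Formula A"),
    ]
    passages = [linearize(t) for t in triples]
    question = "Which formula is used to treat Syndrome X?"
    print(build_prompt(question, retrieve(question, passages)))
```

In a fuller system the prompt returned by build_prompt would be sent to the fine-tuned model; that step is omitted here because the record does not describe the model's interface.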
format Article
id doaj-art-ddd9662a244a4247b8085bb6f52e2817
institution Kabale University
issn 2589-3777
language English
publishDate 2025-03-01
publisher KeAi Communications Co., Ltd.
record_format Article
series Digital Chinese Medicine
spelling doaj-art-ddd9662a244a4247b8085bb6f52e2817
Timestamp: 2025-08-20T03:49:45Z
Language code: eng
Publisher: KeAi Communications Co., Ltd.
Journal: Digital Chinese Medicine (ISSN 2589-3777)
Published: 2025-03-01, volume 8, issue 1, pages 36-45
DOI: 10.1016/j.dcmed.2025.03.011
Zhou Chunfang: School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, Jiangsu 210023, China
Gong Qingyue (corresponding author): School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, Jiangsu 210023, China; Jiangsu Province Engineering Research Center of TCM Intelligence Health Service, Nanjing University of Chinese Medicine, Nanjing, Jiangsu 210023, China
Zhan Wendong: School of Life Science, Beijing Institute of Technology, Beijing 100081, China
Zhu Jinyang: School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, Jiangsu 210023, China
Luan Huidan: School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, Jiangsu 210023, China
title TCMLCM: an intelligent question-answering model for traditional Chinese medicine lung cancer based on the KG2TRAG method
topic Traditional Chinese medicine (TCM)
Lung cancer
Question-answering
Large language model
Fine-tuning
Knowledge graph
url http://www.sciencedirect.com/science/article/pii/S2589377725000291