Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations

Abstract Digital health technologies hold significant potential for reducing global healthcare disparities. Large language models (LLMs) offer new opportunities to enhance access to culturally specific healthcare, including traditional Chinese medicine (TCM). This study evaluated the diagnostic and...

Bibliographic Details
Main Authors: Yu Liu, Yishan Yuan, Keming Yan, Yuanyuan Li, Valeria Sacca, Sierra Hodges, Mattia Cannistra, Pauline Jeong, Jiani Wu, Jian Kong
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-025-01845-2
_version_ 1849342088610578432
author Yu Liu
Yishan Yuan
Keming Yan
Yuanyuan Li
Valeria Sacca
Sierra Hodges
Mattia Cannistra
Pauline Jeong
Jiani Wu
Jian Kong
author_facet Yu Liu
Yishan Yuan
Keming Yan
Yuanyuan Li
Valeria Sacca
Sierra Hodges
Mattia Cannistra
Pauline Jeong
Jiani Wu
Jian Kong
author_sort Yu Liu
collection DOAJ
description Abstract Digital health technologies hold significant potential for reducing global healthcare disparities. Large language models (LLMs) offer new opportunities to enhance access to culturally specific healthcare, including traditional Chinese medicine (TCM). This study evaluated the diagnostic and treatment performance of seven publicly available LLMs using a real-world acupuncture case, comparing their outputs with three professional acupuncturists across five domains: Western diagnosis, TCM diagnosis, acupoint selection, needling technique, and herbal medicine. Twenty-eight expert evaluators from China, South Korea, and the United States assessed the responses using a multilingual survey. LLMs performed comparably to acupuncturists in Western diagnosis and showed variable performance in TCM-specific tasks. GPT-4o, Qwen 2.5 Max, and Doubao 1.5 Pro demonstrated the highest alignment with expert evaluations, particularly in TCM diagnosis and acupoint selection. These findings highlight the potential of general-purpose LLMs to support culturally grounded medical decision-making and reduce access barriers in TCM care systems.
format Article
id doaj-art-950d30ea6e0e4b05a0562a6766f33d23
institution Kabale University
issn 2398-6352
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series npj Digital Medicine
spelling doaj-art-950d30ea6e0e4b05a0562a6766f33d23
2025-08-20T03:43:30Z
eng
Nature Portfolio
npj Digital Medicine
2398-6352
2025-07-01
8
1
1
12
10.1038/s41746-025-01845-2
Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations
Yu Liu [0]
Yishan Yuan [1]
Keming Yan [2]
Yuanyuan Li [3]
Valeria Sacca [4]
Sierra Hodges [5]
Mattia Cannistra [6]
Pauline Jeong [7]
Jiani Wu [8]
Jian Kong [9]
[0] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[1] Beijing University of Chinese Medicine
[2] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[3] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[4] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[5] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[6] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[7] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[8] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
[9] Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School
Abstract Digital health technologies hold significant potential for reducing global healthcare disparities. Large language models (LLMs) offer new opportunities to enhance access to culturally specific healthcare, including traditional Chinese medicine (TCM). This study evaluated the diagnostic and treatment performance of seven publicly available LLMs using a real-world acupuncture case, comparing their outputs with three professional acupuncturists across five domains: Western diagnosis, TCM diagnosis, acupoint selection, needling technique, and herbal medicine. Twenty-eight expert evaluators from China, South Korea, and the United States assessed the responses using a multilingual survey. LLMs performed comparably to acupuncturists in Western diagnosis and showed variable performance in TCM-specific tasks. GPT-4o, Qwen 2.5 Max, and Doubao 1.5 Pro demonstrated the highest alignment with expert evaluations, particularly in TCM diagnosis and acupoint selection. These findings highlight the potential of general-purpose LLMs to support culturally grounded medical decision-making and reduce access barriers in TCM care systems.
https://doi.org/10.1038/s41746-025-01845-2
spellingShingle Yu Liu
Yishan Yuan
Keming Yan
Yuanyuan Li
Valeria Sacca
Sierra Hodges
Mattia Cannistra
Pauline Jeong
Jiani Wu
Jian Kong
Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations
npj Digital Medicine
title Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations
title_full Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations
title_fullStr Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations
title_full_unstemmed Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations
title_short Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations
title_sort evaluating the role of large language models in traditional chinese medicine diagnosis and treatment recommendations
url https://doi.org/10.1038/s41746-025-01845-2