PRDAGE: a prescription recommendation framework for traditional Chinese medicine based on data augmentation and multi-graph embedding

Bibliographic Details
Main Authors: Zhihua Wen, Yunchun Dong, Lihong Peng, Longxin Zhang, Junfeng Yan
Format: Article
Language: English
Published: PeerJ Inc. 2025-08-01
Series: PeerJ Computer Science
Online Access: https://peerj.com/articles/cs-2974.pdf
Description
Summary:
Background: The prescriptions of Traditional Chinese Medicine (TCM) have made a great contribution to the treatment of disease and the maintenance of good health. Current research on prescription recommendation mainly focuses on the correlation between symptoms and herbs, while the semantic information inherent in both symptoms and herbs has received limited attention. Furthermore, most datasets in the field of TCM are small, which can adversely affect model training.
Methods: To tackle these challenges, we present a prescription recommendation framework called PRDAGE, which is based on data augmentation and multi-graph embedding. We first collected medical records and built a dataset of 3,052 classic medical cases in which the symptoms and herbs were normalized. We then developed a multi-layer embedding method for symptoms and herbs using Sentence-BERT (SBERT) and graph convolutional networks. This method aims to capture and represent the semantic information of symptoms and herbs, as well as the complex relationships between them. We also introduced a median-based random data augmentation method to enrich the medical case data, which improved the model’s accuracy.
Results: The model was evaluated against baseline models on an unenhanced dataset (Dataset-B), and the proposed PRDAGE framework exhibited superior overall performance. Compared to the second-best model, PRDAGE achieved improvements in accuracy and recall of 1.69% and 3.80%, respectively, on the Top@10 metric. Ablation experiments further showed that both the data augmentation and multi-layer embedding modules contributed to the improved performance.
Conclusion: The experimental results suggest that PRDAGE is an effective prescription recommendation framework. The multi-layer embedding approach effectively represents the semantic information of symptoms and the complex relationships between symptoms and herbs, and the median-based data augmentation has a positive impact on the overall performance and generalization ability of the model.
ISSN: 2376-5992
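
Note on the Top@10 metric: the summary reports gains in accuracy and recall at Top@10 but does not define either quantity. The sketch below shows one common way such scores are computed for a ranked herb list, assuming that "accuracy" corresponds to the usual precision@k. The function names, the herb names, and the example prescription are illustrative assumptions and are not taken from the article.

```python
from typing import Sequence, Set


def precision_at_k(ranked_herbs: Sequence[str], true_herbs: Set[str], k: int = 10) -> float:
    """Share of the top-k recommended herbs that occur in the reference prescription."""
    top_k = ranked_herbs[:k]
    hits = sum(1 for herb in top_k if herb in true_herbs)
    return hits / k


def recall_at_k(ranked_herbs: Sequence[str], true_herbs: Set[str], k: int = 10) -> float:
    """Share of the reference prescription's herbs that appear in the top-k list."""
    if not true_herbs:
        return 0.0  # convention chosen here for an empty reference prescription
    top_k = ranked_herbs[:k]
    hits = sum(1 for herb in top_k if herb in true_herbs)
    return hits / len(true_herbs)


# Hypothetical example data: a model's ranked output and a reference prescription.
ranked = ["gancao", "renshen", "baizhu", "fuling", "danggui"]
reference = {"gancao", "fuling", "chenpi"}

p_at_5 = precision_at_k(ranked, reference, k=5)
r_at_5 = recall_at_k(ranked, reference, k=5)
print(f"precision@5={p_at_5:.3f}, recall@5={r_at_5:.3f}")
```

In an evaluation, these per-case scores would typically be averaged over all test cases; whether the article averages them this way is not stated in this record.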