DrugBERT: a BERT-based approach integrating LDA topic embedding and efficacy-aware mechanism for predicting anti-tumor drug efficacy

Bibliographic Details
Main Authors: Weiwei Zhu, Xiaodong Jiang, Lei Zhang, Peng Zhou, Xinping Xie, Hongqiang Wang
Format: Article
Language: English
Published: BMC 2025-08-01
Series: Journal of Translational Medicine
Online Access: https://doi.org/10.1186/s12967-025-06795-7
Description
Summary: Abstract. Background: Because of tumor genetic heterogeneity, personalized medicine has become a central focus of cancer research. Accurately predicting a patient's drug response before treatment begins remains a critical challenge for the field. Methods: This paper proposes DrugBERT, a BERT-based framework that integrates LDA topic embedding and a drug efficacy-aware mechanism to predict the efficacy of antitumor drugs. The method incorporates LDA-generated topic embeddings into the BERT language model as a semantic enhancement module and introduces a drug efficacy-aware attention mechanism that prioritizes semantic features related to drug efficacy. The model uses an LSTM to capture long-range dependencies in clinical text data. In addition, the SMOTE algorithm synthesizes minority-class samples to address data imbalance. Results: DrugBERT demonstrated remarkable performance on a dataset of 958 patients with non-small cell cancer treated with antitumor drugs. Furthermore, when validated on an independent dataset of 266 bowel cancer patients, the model achieved a 3% improvement in AUC over previous methods, indicating robust generalization. Conclusions: DrugBERT can help predict the efficacy of antitumor drugs from clinical text while generalizing well to an independent cohort. These findings highlight its potential for optimizing personalized therapeutic strategies through language models.
ISSN: 1479-5876
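
The summary mentions that SMOTE is used to synthesize minority-class samples. The record does not describe the authors' implementation or parameters, so the following is only a minimal sketch of the general SMOTE idea: pick a minority sample, pick one of its nearest minority neighbours, and interpolate between them. The function name `smote_like_oversample`, the parameters `n_new`, `k`, and `seed`, and the toy feature vectors are all hypothetical and chosen for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    This is an illustrative sketch, not the implementation used in the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Take a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class feature vectors (hypothetical values).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like_oversample(minority, n_new=6)
```

Because each synthetic point is a convex combination of two real minority samples, it always lies on the segment between them. In a text pipeline like the one summarized above, the interpolation would presumably operate on numeric feature vectors rather than on raw text, but the record does not say at which stage it is applied.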