DrugBERT: a BERT-based approach integrating LDA topic embedding and efficacy-aware mechanism for predicting anti-tumor drug efficacy
Abstract Background: Due to the complexity of tumor genetic heterogeneity, personalized medicine has progressively emerged as a central focus of cancer research. However, accurately predicting a patient's drug response before treatment remains a critical challenge to the development...
| Main Authors: | Weiwei Zhu, Xiaodong Jiang, Lei Zhang, Peng Zhou, Xinping Xie, Hongqiang Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-08-01 |
| Series: | Journal of Translational Medicine |
| Subjects: | Drug efficacy prediction; LDA topic embedding; BERT; Self-attention mechanism; Clinical text data |
| Online Access: | https://doi.org/10.1186/s12967-025-06795-7 |
| Field | Value |
|---|---|
| author | Weiwei Zhu, Xiaodong Jiang, Lei Zhang, Peng Zhou, Xinping Xie, Hongqiang Wang |
| collection | DOAJ |
| description | Abstract Background: Due to the complexity of tumor genetic heterogeneity, personalized medicine has progressively emerged as a central focus of cancer research. However, accurately predicting a patient's drug response before treatment remains a critical challenge for the field. Methods: This paper proposes DrugBERT, a BERT-based framework integrated with LDA topic embedding and a drug efficacy-aware mechanism for predicting the efficacy of antitumor drugs. The method incorporates LDA-generated topic embeddings into the BERT language model as a semantic enhancement module and introduces a drug efficacy-aware attention mechanism to prioritize efficacy-related semantic features. The model uses an LSTM to capture long-range dependencies in clinical text data. In addition, the SMOTE algorithm is used to synthesize minority-class samples to address data imbalance. Results: DrugBERT demonstrated remarkable performance on a dataset of 958 patients with non-small cell lung cancer treated with antitumor drugs. Furthermore, when validated on an independent dataset of 266 bowel cancer patients, the model achieved a 3% improvement in AUC over previous methods, indicating robust generalization. Conclusions: DrugBERT can help predict the efficacy of antitumor drugs from clinical text while exhibiting strong generalization capability. These findings highlight its potential for optimizing personalized therapeutic strategies through language models. |
| format | Article |
| id | doaj-art-ba307aa7f2d440d2b5a711925e58863d |
| institution | Kabale University |
| issn | 1479-5876 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | BMC |
| record_format | Article |
| series | Journal of Translational Medicine |
| spelling | doaj-art-ba307aa7f2d440d2b5a711925e58863d; 2025-08-20T04:02:55Z; eng; BMC; Journal of Translational Medicine; 1479-5876; 2025-08-01; vol. 23, no. 1, pp. 1-11; 10.1186/s12967-025-06795-7; DrugBERT: a BERT-based approach integrating LDA topic embedding and efficacy-aware mechanism for predicting anti-tumor drug efficacy; Weiwei Zhu (University of Science and Technology of China); Xiaodong Jiang (Medical Oncology Department, The First Affiliated Hospital of University of Science and Technology of China); Lei Zhang (Department of Pharmacy, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China); Peng Zhou (School of Life Science, Hefei Normal University); Xinping Xie (School of Mathematics and Physics, Anhui Jianzhu University); Hongqiang Wang (Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences); https://doi.org/10.1186/s12967-025-06795-7; Drug efficacy prediction; LDA topic embedding; BERT; Self-attention mechanism; Clinical text data |
| title | DrugBERT: a BERT-based approach integrating LDA topic embedding and efficacy-aware mechanism for predicting anti-tumor drug efficacy |
| topic | Drug efficacy prediction; LDA topic embedding; BERT; Self-attention mechanism; Clinical text data |
| url | https://doi.org/10.1186/s12967-025-06795-7 |
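
The abstract states that SMOTE is used to synthesize minority-class samples to address data imbalance. The record does not describe the authors' implementation, so the following is only a minimal sketch of the interpolation idea behind SMOTE in plain Python. The function name `smote_sketch`, the neighbour count `k`, and the toy feature vectors are all assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between
    minority-class samples and their nearest minority neighbours.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample.
        others = [p for p in minority if p is not base]
        # Keep the k nearest by Euclidean distance.
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-D vectors standing in for minority-class features (made-up values).
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority_class, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, it stays inside the region the minority class already occupies. In practice a maintained library implementation would normally be preferred over a hand-written one.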