Anticancer drug synergy prediction based on CatBoost
Background The research of cancer treatments has always been a hot topic in the medical field. Multi-targeted combination drugs have been considered as an ideal option for cancer treatment. Since it is not feasible to use clinical experience or high-throughput screening to identify the complete comb...
Saved in:
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
PeerJ Inc.
2025-05-01
|
| Series: | PeerJ Computer Science |
| Subjects: | |
| Online Access: | https://peerj.com/articles/cs-2829.pdf |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Background The research of cancer treatments has always been a hot topic in the medical field. Multi-targeted combination drugs have been considered as an ideal option for cancer treatment. Since it is not feasible to use clinical experience or high-throughput screening to identify the complete combinatorial space, methods such as machine learning models offer the possibility to explore the combinatorial space effectively. Methods In this work, we proposed a machine learning method based on CatBoost to predict the synergy scores of anticancer drug combinations on cancer cell lines, which utilized oblivious trees and ordered boosting technique to avoid overfitting and bias. The model was trained and tested using the data screened from NCI-ALMANAC dataset. The drugs were characterized with morgan fingerprints, drug target information, monotherapy information, and the cell lines were described with gene expression profiles. Results In the stratified 5-fold cross-validation, our method obtained excellent results, where, the receiver operating characteristic area under the curve (ROC AUC) is 0.9217, precision-recall area under the curve (PR AUC) is 0.4651, mean squared error (MSE) is 0.1365, and Pearson correlation coefficient is 0.5335. The performance is significantly better than three other advanced models. Additionally, when using SHapley Additive exPlanations (SHAP) to interpret the biological significance of the prediction results, we found that drug features played more prominent roles than cell line features, and genes associated with cancer development, such as PTK2, CCND1, and GNA11, played an important part in drug synergy prediction. Combining the experimental results, the model proposed in this study has a good prediction effect and can be used as an alternative method for predicting anticancer drug combinations. |
|---|---|
| ISSN: | 2376-5992 |