BertADP: a fine-tuned protein language model for anti-diabetic peptide prediction
Abstract Background Diabetes is a global metabolic disease that urgently calls for the development of new and effective therapeutic agents. Anti-diabetic peptides (ADPs) have emerged as a research hotspot due to their therapeutic potential and natural safety, representing a promising class of functi...
| Main Authors: | Xueqin Xie, Changchun Wu, Yixuan Qi, Shanghua Liu, Jian Huang, Hao Lyu, Fuying Dao, Hao Lin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-07-01 |
| Series: | BMC Biology |
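| Subjects: | Anti-diabetic peptides; Protein language models; Fine-tuning; Bioactive peptide prediction; Deep learning |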
| Online Access: | https://doi.org/10.1186/s12915-025-02312-w |
| _version_ | 1849331666319835136 |
|---|---|
| author | Xueqin Xie; Changchun Wu; Yixuan Qi; Shanghua Liu; Jian Huang; Hao Lyu; Fuying Dao; Hao Lin |
| author_sort | Xueqin Xie |
| collection | DOAJ |
| description | Abstract. Background: Diabetes is a global metabolic disease that urgently calls for new and effective therapeutic agents. Anti-diabetic peptides (ADPs) have emerged as a research hotspot because of their therapeutic potential and natural safety, and they represent a promising class of functional peptides for diabetes management. However, conventional computational approaches to ADP prediction rely mainly on manually extracted sequence features. These methods often lack generalizability and perform poorly on short peptides, which hinders effective ADP discovery. Results: In this study, we introduce a strategy for fine-tuning large-scale pre-trained protein language models (PLMs) for ADP prediction, enabling automated extraction of discriminative semantic representations. We established the most comprehensive ADP dataset to date, comprising 899 rigorously curated non-redundant ADPs and 67 newly collected potential candidates. Based on three model construction strategies, we developed 11 candidate models. Among them, BertADP (a fine-tuned ProtBert model) performed best on the independent test set, outperforming existing ADP prediction tools with an overall accuracy of 0.955, a sensitivity of 1.000, and a specificity of 0.910. Notably, BertADP adapted well to sequence length, maintaining stable performance across both standard and short peptide sequences. Conclusions: BertADP represents the first PLM-based intelligent prediction tool for ADPs; its strong identification capability is expected to accelerate anti-diabetic drug development and facilitate personalized therapeutic strategies, thereby enhancing precision diabetes management. Furthermore, the proposed approach provides a generalizable framework that can be extended to other bioactive peptide discovery studies, offering an innovative solution for bioactive peptide mining. |
| format | Article |
| id | doaj-art-1003da7f4b1640bca1ac0cdae32ca660 |
| institution | Kabale University |
| issn | 1741-7007 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Biology |
| spelling | doaj-art-1003da7f4b1640bca1ac0cdae32ca660; 2025-08-20T03:46:27Z; eng; BMC; BMC Biology; 1741-7007; 2025-07-01; vol. 23, no. 1, pp. 1-14; 10.1186/s12915-025-02312-w; BertADP: a fine-tuned protein language model for anti-diabetic peptide prediction; Xueqin Xie (0), Changchun Wu (1), Yixuan Qi (2), Shanghua Liu (3), Jian Huang (4), Hao Lyu (5), Fuying Dao (6), Hao Lin (7); affiliations: (0-5, 7) The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China; (6) School of Biological Sciences, Nanyang Technological University; abstract as given in the description field; https://doi.org/10.1186/s12915-025-02312-w; Anti-diabetic peptides; Protein language models; Fine-tuning; Bioactive peptide prediction; Deep learning |
| title | BertADP: a fine-tuned protein language model for anti-diabetic peptide prediction |
| topic | Anti-diabetic peptides; Protein language models; Fine-tuning; Bioactive peptide prediction; Deep learning |
| url | https://doi.org/10.1186/s12915-025-02312-w |
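
The abstract reports three evaluation figures for BertADP on the independent test set: accuracy 0.955, sensitivity 1.000, and specificity 0.910. As a reading aid, the sketch below shows how these three quantities are conventionally computed from confusion-matrix counts. The record does not give the test-set size or its class balance, so the counts used here are hypothetical and chosen only because they reproduce the reported values; the helper name `binary_metrics` is likewise an illustration, not part of the authors' tool. The reported figures are at least consistent with equal numbers of positive and negative test peptides, since (1.000 + 0.910) / 2 = 0.955.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # fraction of true ADPs recovered
    specificity = tn / (tn + fp)                 # fraction of non-ADPs correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all predictions that are correct
    return accuracy, sensitivity, specificity


# Hypothetical counts (NOT from the paper): 100 positives, all detected,
# and 100 negatives, 9 of which are misclassified as ADPs.
acc, sen, spe = binary_metrics(tp=100, fn=0, tn=91, fp=9)
print(round(acc, 3), round(sen, 3), round(spe, 3))
```

With a class-balanced test set, accuracy is simply the mean of sensitivity and specificity; with unbalanced classes the same sensitivity and specificity would yield a different accuracy.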