PBertKla: a protein large language model for predicting human lysine lactylation sites

Bibliographic Details
Main Authors: Hongyan Lai, Diyu Luo, Mi Yang, Tao Zhu, Huan Yang, Xinwei Luo, Yijie Wei, Sijia Xie, Feitong Hong, Kunxian Shu, Fuying Dao, Hui Ding
Format: Article
Language: English
Published: BMC 2025-04-01
Series: BMC Biology
Online Access: https://doi.org/10.1186/s12915-025-02202-1
Description
Summary: Background: Lactylation is a newly discovered type of post-translational modification that occurs primarily on lysine (K) residues of both histone and non-histone proteins and exerts diverse effects on its targets. Research has shown that lysine lactylation (Kla) is ubiquitous across cell types and participates in determining cell function and fate, as well as in the initiation and progression of various diseases. Precise identification of Kla sites is fundamental to elucidating their biological functions and uncovering their application potential. Results: Here, we propose a novel human Kla site predictor, named PBertKla, built by curating a reliable benchmark dataset with a proper sample length and sequence identity threshold and using it to train a protein large language model with optimal hyperparameters. Extensive experimental results consistently demonstrated that the model has robust human Kla site prediction ability, achieving an AUC (area under the receiver operating characteristic curve) above 0.880 on the independent validation data. Feature visualization analysis further validated the effectiveness of the model in feature learning and representation from Kla sequences. Moreover, we benchmarked PBertKla against other cutting-edge models on an independent testing dataset drawn from different sources, highlighting its superiority and transferability. Conclusions: All results indicate that PBertKla excels as an automatic predictor of human Kla sites and should advance the investigation of lactylation modifications and their significance in health and disease.
ISSN:1741-7007