SuperEdgeGO: Edge-supervised graph representation learning for enhanced protein function prediction.

Bibliographic Details
Main Authors: Shugang Zhang, Yuntong Li, Wenjian Ma, Qing Cai, Jing Qin, Xiangpeng Bi, Huasen Jiang, Xiaoyu Huang, Zhiqiang Wei
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-08-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1013343
Description
Summary: Understanding the functions of proteins is of great importance for deciphering the mechanisms of life activities. To date, over 200 million proteins are known, but only 0.2% of them have well-annotated functional terms. By measuring the contacts among residues, proteins can be described as graphs, so that graph learning approaches can be applied to learn protein representations. However, existing graph-based methods focus on enriching the residue node information and do not fully exploit the edge information, which leads to suboptimal representations given the strong association of residue contacts with protein structures and functions. In this article, we propose SuperEdgeGO, which introduces the supervision of edges in protein graphs to learn a better graph representation for protein function prediction. Unlike common graph convolution methods that use edge information in a plain or unsupervised way, we introduce a supervised attention to encode the residue contacts explicitly into the protein representation. Comprehensive experiments demonstrate that SuperEdgeGO achieves state-of-the-art performance on all three categories of protein functions, and additional ablation analysis supports the effectiveness of the devised edge supervision strategy. This strategy has broad application prospects in the study of protein function and related fields.
ISSN: 1553-734X, 1553-7358
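
The summary notes that a protein can be described as a graph by measuring the contacts among its residues. The following is a minimal sketch of that general idea only, not the SuperEdgeGO implementation: it assumes one 3D coordinate per residue (for example the C-alpha atom) and treats two residues as in contact when their distance falls below a cutoff. The function name `contact_edges`, the toy coordinates, and the 8.0 angstrom cutoff are illustrative assumptions and do not come from the article.

```python
import math


def contact_edges(coords, cutoff=8.0):
    """Return undirected edges (i, j), i < j, for residue pairs in contact.

    coords: list of (x, y, z) tuples, one per residue (assumed C-alpha positions).
    cutoff: distance threshold in angstroms (illustrative value, an assumption).
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            # Two residues are connected when they are closer than the cutoff.
            if math.dist(coords[i], coords[j]) < cutoff:
                edges.append((i, j))
    return edges


# Toy example: three residues close together and one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_edges(toy_coords))  # [(0, 1), (0, 2), (1, 2)]
```

In a graph learning setting, the residues would serve as nodes and the returned pairs as edges. How SuperEdgeGO supervises those edges is described in the article itself and is not reproduced here.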