SEGT-GO: a graph transformer method based on PPI serialization and explanatory artificial intelligence for protein function prediction

Bibliographic Details
Main Authors: Yansong Wang, Yundong Sun, Baohui Lin, Haotian Zhang, Xiaoling Luo, Yumeng Liu, Xiaopeng Jin, Dongjie Zhu
Format: Article
Language: English
Published: BMC 2025-02-01
Series: BMC Bioinformatics
Online Access: https://doi.org/10.1186/s12859-025-06059-7
collection DOAJ
description Abstract
Background: A massive number of protein sequences have been obtained, but their functions remain challenging to discern. Protein-Protein Interaction (PPI) networks have played a crucial role in recent research on protein function prediction. Uncovering potential functional relationships between distant proteins within PPI networks is essential for improving the accuracy of protein function prediction. Most current studies attempt to capture these distant relationships by stacking graph network layers, but performance gains diminish as the number of layers increases.
Results: To further explore the potential functional relationships between multi-hop proteins in PPI networks, this paper proposes SEGT-GO, a Graph Transformer method based on PPI multi-hop neighborhood Serialization and Explainable artificial intelligence for large-scale multispecies protein function prediction. The multi-hop neighborhood serialization maps multi-hop information in the PPI network into serialized feature embeddings, enabling the Graph Transformer to learn deeper functional features within the PPI network. The SHAP eXplainable Artificial Intelligence (XAI) framework, which is based on game theory, optimizes model input and filters out feature noise, enhancing model performance.
Conclusions: Compared to the advanced network method DeepGraphGO, SEGT-GO achieves more competitive results on standard large-scale datasets and superior results on small ones, validating its ability to extract functional information from proteins deep in the PPI network. Furthermore, SEGT-GO achieves superior results in cross-species learning and in predicting the functions of unseen proteins, further demonstrating the method’s strong generalization ability.
id doaj-art-eb95ddbc3d5f468a9233d42c34d077b7
institution DOAJ
issn 1471-2105
spelling doaj-art-eb95ddbc3d5f468a9233d42c34d077b7; 2025-08-20T02:43:13Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2025-02-01; vol. 26, no. 1, pp. 1-22; doi:10.1186/s12859-025-06059-7
Yansong Wang (School of Computer Science and Technology, Harbin Institute of Technology Weihai Campus)
Yundong Sun (School of Computer Science and Technology, Harbin Institute of Technology Weihai Campus)
Baohui Lin (College of Big Data and Internet, Shenzhen Technology University)
Haotian Zhang (School of Computer Science and Technology, Harbin Institute of Technology Weihai Campus)
Xiaoling Luo (College of Computer Science and Software Engineering, Shenzhen University)
Yumeng Liu (College of Big Data and Internet, Shenzhen Technology University)
Xiaopeng Jin (College of Big Data and Internet, Shenzhen Technology University)
Dongjie Zhu (School of Computer Science and Technology, Harbin Institute of Technology Weihai Campus)
topic Protein function prediction
Graph transformer
PPI networks
Multi-hop neighborhood serialization
Explainable artificial intelligence