MESM: integrating multi-source data for high-accuracy protein-protein interactions prediction through multimodal language models

Bibliographic Details
Main Authors: Feng Wang, Jinming Chu, Liyan Shen, Shan Chang
Format: Article
Language: English
Published: BMC 2025-08-01
Series: BMC Biology
Subjects: Protein–protein interaction; Multimodal protein feature pre-training; Graph neural network
Online Access: https://doi.org/10.1186/s12915-025-02356-y
collection DOAJ
description Abstract
Background: Protein-protein interactions (PPIs) play a critical role in essential biological processes such as signal transduction, enzyme activity regulation, cytoskeletal structure, immune responses, and gene regulation. However, current methods mainly extract features from protein sequences and use graph neural networks (GNNs) to acquire interaction information from the PPI network graph. This limits the model’s ability to learn richer and more effective interaction information, which in turn hurts prediction performance.
Results: In this study, we propose MESM, a novel deep learning method for predicting PPIs. The datasets used for the PPI prediction task were primarily constructed from the STRING database and comprise two Homo sapiens PPI datasets, SHS27k and SHS148k, and two Saccharomyces cerevisiae PPI datasets, SYS30k and SYS60k. MESM consists of three key modules. First, MESM extracts multimodal representations from protein sequence information, protein structure information, and point cloud features through a Sequence Variational Autoencoder (SVAE), a Variational Graph Autoencoder (VGAE), and a PointNet Autoencoder (PAE), respectively. A Fusion Autoencoder (FAE) then integrates these multimodal features, generating rich and balanced protein representations. Next, MESM uses GraphGPS to learn structural information from the PPI network graph and combines it with a Graph Attention Network (GAT) to further capture protein interaction information. Finally, MESM uses a Graph Convolutional Network (GCN) and SubgraphGCN to extract global and local features from the overall graph and from subgraphs, respectively. Moreover, we build seven independent graphs from the overall PPI network graph, one per PPI type, so that the model learns the features of each interaction type separately.
Conclusions: Compared to state-of-the-art methods, MESM achieved improvements of 8.77%, 4.98%, 7.48%, and 6.08% on SHS27k, SHS148k, SYS30k, and SYS60k, respectively. The experimental results demonstrate that MESM significantly improves PPI prediction performance.
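The abstract mentions building seven independent graphs from the overall PPI network, one per interaction type. The following is a minimal sketch of how such a split might look, not the authors' implementation. It assumes the seven type labels commonly used in STRING-derived benchmarks such as SHS27k (the record itself does not name them), and the function name `split_by_type`, the edge format, and the toy proteins are all hypothetical.

```python
from collections import defaultdict

# Assumed label set: the seven interaction types commonly used in
# STRING-derived benchmarks. The record does not list them.
PPI_TYPES = ("reaction", "binding", "ptmod", "activation",
             "inhibition", "catalysis", "expression")


def split_by_type(edges):
    """Build one undirected adjacency map per interaction type.

    `edges` is an iterable of (protein_a, protein_b, types) tuples, where
    `types` is the set of interaction labels annotated for that pair.
    A pair with several labels appears in several per-type graphs.
    """
    graphs = {t: defaultdict(set) for t in PPI_TYPES}
    for a, b, types in edges:
        for t in types:
            if t not in graphs:
                raise ValueError(f"unknown interaction type: {t}")
            graphs[t][a].add(b)
            graphs[t][b].add(a)
    return graphs


# Hypothetical toy network; protein names are placeholders.
toy_edges = [
    ("P1", "P2", {"binding", "activation"}),
    ("P2", "P3", {"binding"}),
    ("P1", "P3", {"inhibition"}),
]
type_graphs = split_by_type(toy_edges)
```

Because PPI prediction on these datasets is multi-label, a single protein pair can contribute an edge to more than one of the seven graphs, which is why the sketch iterates over a set of labels per pair rather than assigning each pair to exactly one graph.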
id doaj-art-3cb364d5a3314c80840f8dff71833bd9
institution Kabale University
issn 1741-7007
affiliations Feng Wang: School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University
Jinming Chu: School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University
Liyan Shen: School of Computer Engineering, Suzhou Vocational University
Shan Chang: Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology
topic Protein–protein interaction
Multimodal protein feature pre-training
Graph neural network