MambaCAttnGCN+: a comprehensive framework integrating MambaTextCNN, cross-attention and graph convolution network for piRNA-disease association prediction

Bibliographic Details
Main Authors: Dengju Yao, Xiangkui Li, Xiaojuan Zhan, Bo Zhang, Jian Zhang
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-07641-y
Description
Summary: Elucidating the interactions between piwi-interacting RNAs (piRNAs) and diseases is crucial for diagnosis and treatment. Although several computational approaches have been developed to investigate piRNA-disease associations, sparse datasets make it difficult to capture the complex relationships between piRNAs and diseases. To develop a more accurate prediction model for piRNA-disease associations, we integrated piRNA sequence information, disease-related semantic terms, and existing piRNA-disease association networks to construct a heterogeneous graph. Utilizing the Mamba module, we developed an innovative sequence embedding model, MambaTextCNN, to extract features from piRNA sequences, which we used as node attributes within the heterogeneous graph. A heterogeneous graph convolution method was then applied to identify potential associations between piRNAs and diseases, with cross-attention mechanisms further enhancing node features. Finally, by incorporating positive-unlabeled learning techniques, we developed the piRNA-disease association prediction model MambaCAttnGCN+. In 5-fold cross-validation, MambaCAttnGCN+ achieved AUCs of 0.94 and 0.953 on two datasets, outperforming seven other state-of-the-art models. Additionally, ablation experiments comparing three distinct approaches for representing sequence node features showed that features extracted by MambaTextCNN were the most effective. MambaCAttnGCN+ represents a valuable predictive tool for future research on piRNA-disease associations in biomedicine.
ISSN: 2045-2322
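
Illustrative sketch: The summary describes placing piRNA and disease nodes in one graph and applying graph convolution over it. The Python sketch below is not the authors' implementation. It assumes a plain bipartite graph built only from known association pairs and uses the standard symmetric normalization with self-loops, applied once without learned weights or a nonlinearity. It omits the MambaTextCNN features, the disease semantic terms, cross-attention, and positive-unlabeled learning. All function names and the toy data are hypothetical.

```python
import math


def build_bipartite_adjacency(n_pirna, n_disease, pairs):
    """Square adjacency with piRNA nodes first, then disease nodes.

    `pairs` holds (pirna_index, disease_index) tuples for known associations.
    """
    n = n_pirna + n_disease
    adj = [[0.0] * n for _ in range(n)]
    for p, d in pairs:
        i, j = p, n_pirna + d  # disease indices are offset past the piRNAs
        adj[i][j] = 1.0
        adj[j][i] = 1.0  # undirected edge
    return adj


def normalize_with_self_loops(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def propagate(norm_adj, features):
    """One weight-free propagation step: each node averages over its neighbours."""
    n = len(norm_adj)
    dim = len(features[0])
    return [[sum(norm_adj[i][k] * features[k][c] for k in range(n))
             for c in range(dim)]
            for i in range(n)]


# Toy data (assumption): 2 piRNAs, 2 diseases, both piRNAs linked to disease 0.
toy_adj = build_bipartite_adjacency(2, 2, [(0, 0), (1, 0)])
toy_norm = normalize_with_self_loops(toy_adj)
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
toy_output = propagate(toy_norm, toy_features)
```

In this toy graph, disease 1 has no known associations, so its features pass through the propagation step unchanged. That illustrates why sparse association data, as the summary notes, limits what graph convolution alone can learn for poorly connected nodes.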