MambaCAttnGCN+: a comprehensive framework integrating MambaTextCNN, cross-attention and graph convolution network for piRNA-disease association prediction
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-07641-y |
| Summary: | Elucidating the interactions between piwi-interacting RNAs (piRNAs) and diseases is crucial for diagnosis and treatment. Although several computational approaches have been developed to investigate piRNA-disease associations, sparse datasets make it difficult to capture the complex relationships between piRNAs and diseases. To develop a more accurate prediction model for these associations, we integrated piRNA sequence information, disease-related semantic terms, and existing piRNA-disease association networks to construct a heterogeneous graph. Using the Mamba module, we developed a sequence embedding model, MambaTextCNN, to extract features from piRNA sequences, which we used as node attributes within the heterogeneous graph. A heterogeneous graph convolution method was then applied to identify potential associations between piRNAs and diseases, with cross-attention mechanisms further enhancing node features. Finally, by incorporating positive-unlabeled learning techniques, we developed the piRNA-disease association prediction model MambaCAttnGCN+. In 5-fold cross-validation, MambaCAttnGCN+ achieved AUCs of 0.94 and 0.953 on two datasets, outperforming seven other state-of-the-art models. Additionally, ablation experiments comparing three distinct approaches for representing sequence node features showed that features extracted by MambaTextCNN were the most effective. MambaCAttnGCN+ represents a valuable predictive tool for future research on piRNA-disease associations in biomedicine. |
| ISSN: | 2045-2322 |
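
The summary mentions cross-attention mechanisms that enhance node features. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention in which piRNA node features attend over disease node features. It is not the authors' implementation: the function names, feature dimensions, and random weights are all hypothetical, and the record does not specify how MambaCAttnGCN+ wires attention into the graph convolution.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Generic scaled dot-product cross-attention (illustrative, not the paper's code).

    queries: features of one node type (here, hypothetical piRNA nodes)
    context: features of the other node type (here, hypothetical disease nodes)
    """
    q = matmul(queries, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]          # transpose of k
    scores = matmul(q, k_t)                       # one row per query node
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_pirna, n_disease, d_in, d_att = 5, 3, 8, 4      # assumed toy sizes
pirna_feats = random_matrix(n_pirna, d_in, rng)
disease_feats = random_matrix(n_disease, d_in, rng)
w_q = random_matrix(d_in, d_att, rng)
w_k = random_matrix(d_in, d_att, rng)
w_v = random_matrix(d_in, d_att, rng)

attended, weights = cross_attention(pirna_feats, disease_feats, w_q, w_k, w_v)
```

Each row of `weights` is a probability distribution over the disease nodes, and each row of `attended` is the corresponding weighted mix of projected disease features. In a trained model the projection matrices would be learned rather than sampled at random.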