Deep generalizable prediction of RNA secondary structure via base pair motif energy

Abstract Deep learning methods have demonstrated great performance for RNA secondary structure prediction. However, generalizability is a common unsolved issue on unseen out-of-distribution RNA families, which hinders further improvement of the accuracy and robustness of deep learning methods. Here...

Bibliographic Details
Main Authors: Heqin Zhu, Fenghe Tang, Quan Quan, Ke Chen, Peng Xiong, S. Kevin Zhou
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-025-60048-1
Collection: DOAJ
Description: Deep learning methods have demonstrated great performance for RNA secondary structure prediction. However, generalizability to unseen, out-of-distribution RNA families remains a common unsolved issue, which hinders further improvement of the accuracy and robustness of deep learning methods. Here we construct a base pair motif library that enumerates the complete space of the locally adjacent three-neighbor base pair and records the thermodynamic energy of the corresponding base pair motifs through de novo modeling of tertiary structures. We further develop a deep learning approach for RNA secondary structure prediction, named BPfold, which learns the relationship between RNA sequence and the energy map of base pair motifs. Experiments on sequence-wise and family-wise datasets demonstrate the superiority of BPfold over other state-of-the-art approaches in accuracy and generalizability. We hope this work contributes to integrating physical priors and deep learning methods for the further discovery of RNA structures and functionalities.
Record ID: doaj-art-e72e90e187784a47a5cd0723cfe50987
Institution: Kabale University
ISSN: 2041-1723
Volume: 16
Issue: 1
Pages: 1-13
Author Affiliations:
Heqin Zhu: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC)
Fenghe Tang: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC)
Quan Quan: Key Laboratory of Intelligent Information Processing of Institute of Computing Technology, Chinese Academy of Sciences
Ke Chen: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC)
Peng Xiong: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC)
S. Kevin Zhou: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC)