π-PrimeNovo: an accurate and efficient non-autoregressive deep learning model for de novo peptide sequencing
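| Main Authors: | Xiang Zhang, Tianze Ling, Zhi Jin, Sheng Xu, Zhiqiang Gao, Boyan Sun, Zijie Qiu, Jiaqi Wei, Nanqing Dong, Guangshuai Wang, Guibin Wang, Leyuan Li, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Fuchu He, Wanli Ouyang, Cheng Chang, Siqi Sun |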
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-01-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-024-55021-3 |
| _version_ | 1850048768021364736 |
|---|---|
| author | Xiang Zhang; Tianze Ling; Zhi Jin; Sheng Xu; Zhiqiang Gao; Boyan Sun; Zijie Qiu; Jiaqi Wei; Nanqing Dong; Guangshuai Wang; Guibin Wang; Leyuan Li; Muhammad Abdul-Mageed; Laks V. S. Lakshmanan; Fuchu He; Wanli Ouyang; Cheng Chang; Siqi Sun |
| author_sort | Xiang Zhang |
| collection | DOAJ |
| description | Peptide sequencing via tandem mass spectrometry (MS/MS) is essential in proteomics. Unlike traditional database searches, deep learning excels at de novo peptide sequencing, even for peptides missing from existing databases. Current deep learning models often rely on autoregressive generation, which suffers from error accumulation and slow inference speeds. In this work, we introduce π-PrimeNovo, a non-autoregressive Transformer-based model for peptide sequencing. With our architecture design and a CUDA-enhanced decoding module for precise mass control, π-PrimeNovo achieves significantly higher accuracy and up to 89x faster inference than state-of-the-art methods, making it ideal for large-scale applications like metaproteomics. Additionally, it excels in phosphopeptide mining and detecting low-abundance post-translational modifications (PTMs), marking a substantial advance in peptide sequencing with broad potential in biological research. |
| format | Article |
| id | doaj-art-af31071132d349ff803c1ce41beccd15 |
| institution | DOAJ |
| issn | 2041-1723 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-af31071132d349ff803c1ce41beccd15; 2025-08-20T02:53:53Z; eng; Nature Portfolio; Nature Communications; ISSN 2041-1723; 2025-01-01; vol. 16, no. 1, pp. 1-16; DOI 10.1038/s41467-024-55021-3. Author affiliations: Xiang Zhang (Shanghai Artificial Intelligence Laboratory); Tianze Ling (Tsinghua University); Zhi Jin (Shanghai Artificial Intelligence Laboratory); Sheng Xu (Shanghai Artificial Intelligence Laboratory); Zhiqiang Gao (Shanghai Artificial Intelligence Laboratory); Boyan Sun (State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics); Zijie Qiu (Shanghai Artificial Intelligence Laboratory); Jiaqi Wei (Shanghai Artificial Intelligence Laboratory); Nanqing Dong (Shanghai Artificial Intelligence Laboratory); Guangshuai Wang (Shanghai Artificial Intelligence Laboratory); Guibin Wang (State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics); Leyuan Li (State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics); Muhammad Abdul-Mageed (University of British Columbia); Laks V. S. Lakshmanan (University of British Columbia); Fuchu He (State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics); Wanli Ouyang (Shanghai Artificial Intelligence Laboratory); Cheng Chang (State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics); Siqi Sun (Research Institute of Intelligent Complex Systems, Fudan University) |
| title | π-PrimeNovo: an accurate and efficient non-autoregressive deep learning model for de novo peptide sequencing |
| url | https://doi.org/10.1038/s41467-024-55021-3 |