Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins
Palaeoproteomics is a rapidly evolving discipline, and practitioners are constantly developing novel strategies for the analyses and interpretations of complex, degraded protein mixtures. The community has also established standards of good practice to interrogate our data. However, there is a lack...
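Main Authors: | Rodriguez Palomo, Ismael; Nair, Bharath; Chiang, Yun; Dekker, Joannes; Dartigues, Benjamin; Mackie, Meaghan; Evans, Miranda; Macleod, Ruairidh; Olsen, Jesper V.; Collins, Matthew J. |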
---|---|
Format: | Article |
Language: | English |
Published: | Peer Community In, 2024-11-01 |
Series: | Peer Community Journal |
Subjects: | Palaeoproteomics; beta-lactoglobulin; False Discovery Rate; benchmarking; de novo; open search |
Online Access: | https://peercommunityjournal.org/articles/10.24072/pcjournal.491/ |
_version_ | 1825206372430512128 |
---|---|
author | Rodriguez Palomo, Ismael; Nair, Bharath; Chiang, Yun; Dekker, Joannes; Dartigues, Benjamin; Mackie, Meaghan; Evans, Miranda; Macleod, Ruairidh; Olsen, Jesper V.; Collins, Matthew J. |
author_facet | Rodriguez Palomo, Ismael; Nair, Bharath; Chiang, Yun; Dekker, Joannes; Dartigues, Benjamin; Mackie, Meaghan; Evans, Miranda; Macleod, Ruairidh; Olsen, Jesper V.; Collins, Matthew J. |
author_sort | Rodriguez Palomo, Ismael |
collection | DOAJ |
description | Palaeoproteomics is a rapidly evolving discipline, and practitioners are constantly developing novel strategies for the analyses and interpretations of complex, degraded protein mixtures. The community has also established standards of good practice to interrogate our data. However, there is a lack of systematic exploration of how these choices affect the identification of peptides, post-translational modifications (PTMs) and proteins, as well as the significance (through the False Discovery Rate) and correctness of those identifications. We systematically investigated the performance of a wide range of sequencing tools and search engines in a controlled system: the experimental degradation of a single purified protein, bovine β-lactoglobulin (BLG), heated at 95°C and pH 7 for 0, 4 and 128 days. We targeted BLG because it is one of the most robust and ubiquitous proteins in the archaeological record. We tested different reference databases (a targeted dairy protein database and the whole bovine proteome) and three digestion options (tryptic, semi-tryptic and non-specific searches) to evaluate the effect of search space on the identification of peptides. We also explored alternative strategies, including open search, which allows the global identification of PTMs based on a wide precursor mass tolerance, and de novo sequencing to boost sequence coverage. We analysed the samples using Mascot, MaxQuant, Metamorpheus, pFind, Fragpipe and DeNovoGUI (pepNovo+, DirecTag, Novor), benchmarked these tools, and discuss the optimal strategy for the characterisation of ancient proteins. We also studied the physicochemical properties of BLG that correlate with bias in identification coverage. |
format | Article |
id | doaj-art-20ec09271f00409ba25d7a744bc29b46 |
institution | Kabale University |
issn | 2804-3871 |
language | English |
publishDate | 2024-11-01 |
publisher | Peer Community In |
record_format | Article |
series | Peer Community Journal |
spelling | doaj-art-20ec09271f00409ba25d7a744bc29b46 · 2025-02-07T10:17:17Z · eng · Peer Community In · Peer Community Journal · 2804-3871 · 2024-11-01 · 4 · 10.24072/pcjournal.491 · 10.24072/pcjournal.491 · Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins · Rodriguez Palomo, Ismael [0] https://orcid.org/0000-0001-5313-9709 · Nair, Bharath [1] https://orcid.org/0000-0002-1897-4132 · Chiang, Yun [2] https://orcid.org/0000-0003-3605-4054 · Dekker, Joannes [3] https://orcid.org/0000-0002-3952-4448 · Dartigues, Benjamin [4] https://orcid.org/0000-0003-1882-123X · Mackie, Meaghan [5] · Evans, Miranda [6] https://orcid.org/0000-0002-9284-9268 · Macleod, Ruairidh [7] https://orcid.org/0000-0001-8086-4420 · Olsen, Jesper V. [8] https://orcid.org/0000-0002-4747-4938 · Collins, Matthew J. [9] https://orcid.org/0000-0003-4226-5501 · McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom · McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom; Globe Institute, University of Copenhagen, Denmark · Globe Institute, University of Copenhagen, Denmark; The Nice Institute of Chemistry, Université Côte d’Azur, Nice, France · Globe Institute, University of Copenhagen, Denmark; Department of Archaeology, University of York, United Kingdom · Department of Science and Technology, University of Bordeaux, France · Globe Institute, University of Copenhagen, Denmark; Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark; School of Archaeology, University College Dublin, Ireland; Archaeobiomics, Department of Life Sciences and Systems Biology, University of Turin, Italy · McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom; Department of Archaeology, University of York, United Kingdom · McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom · Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark · McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom; Globe Institute, University of Copenhagen, Denmark · Palaeoproteomics is a rapidly evolving discipline, and practitioners are constantly developing novel strategies for the analyses and interpretations of complex, degraded protein mixtures. The community has also established standards of good practice to interrogate our data. However, there is a lack of systematic exploration of how these choices affect the identification of peptides, post-translational modifications (PTMs) and proteins, as well as the significance (through the False Discovery Rate) and correctness of those identifications. We systematically investigated the performance of a wide range of sequencing tools and search engines in a controlled system: the experimental degradation of a single purified protein, bovine β-lactoglobulin (BLG), heated at 95°C and pH 7 for 0, 4 and 128 days. We targeted BLG because it is one of the most robust and ubiquitous proteins in the archaeological record. We tested different reference databases (a targeted dairy protein database and the whole bovine proteome) and three digestion options (tryptic, semi-tryptic and non-specific searches) to evaluate the effect of search space on the identification of peptides. We also explored alternative strategies, including open search, which allows the global identification of PTMs based on a wide precursor mass tolerance, and de novo sequencing to boost sequence coverage. We analysed the samples using Mascot, MaxQuant, Metamorpheus, pFind, Fragpipe and DeNovoGUI (pepNovo+, DirecTag, Novor), benchmarked these tools, and discuss the optimal strategy for the characterisation of ancient proteins. We also studied the physicochemical properties of BLG that correlate with bias in identification coverage. · https://peercommunityjournal.org/articles/10.24072/pcjournal.491/ · Palaeoproteomics; beta-lactoglobulin; False Discovery Rate; benchmarking; de novo; open search |
spellingShingle | Rodriguez Palomo, Ismael; Nair, Bharath; Chiang, Yun; Dekker, Joannes; Dartigues, Benjamin; Mackie, Meaghan; Evans, Miranda; Macleod, Ruairidh; Olsen, Jesper V.; Collins, Matthew J. · Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins · Peer Community Journal · Palaeoproteomics; beta-lactoglobulin; False Discovery Rate; benchmarking; de novo; open search |
title | Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins |
title_full | Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins |
title_fullStr | Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins |
title_full_unstemmed | Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins |
title_short | Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins |
title_sort | benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins |
topic | Palaeoproteomics; beta-lactoglobulin; False Discovery Rate; benchmarking; de novo; open search |
url | https://peercommunityjournal.org/articles/10.24072/pcjournal.491/ |
work_keys_str_mv | AT rodriguezpalomoismael benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT nairbharath benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT chiangyun benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT dekkerjoannes benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT dartiguesbenjamin benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT mackiemeaghan benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT evansmiranda benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT macleodruairidh benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT olsenjesperv benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins AT collinsmatthewj benchmarkingtheidentificationofasingledegradedproteintoexploreoptimalsearchstrategiesforancientproteins |