Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins

Palaeoproteomics is a rapidly evolving discipline, and practitioners are constantly developing novel strategies for the analyses and interpretations of complex, degraded protein mixtures. The community has also established standards of good practice to interrogate our data. However, there is a lack...

Bibliographic Details
Main Authors: Rodriguez Palomo, Ismael; Nair, Bharath; Chiang, Yun; Dekker, Joannes; Dartigues, Benjamin; Mackie, Meaghan; Evans, Miranda; Macleod, Ruairidh; Olsen, Jesper V.; Collins, Matthew J.
Format: Article
Language: English
Published: Peer Community In 2024-11-01
Series: Peer Community Journal
Subjects: Palaeoproteomics; beta-lactoglobulin; False Discovery Rate; benchmarking; de novo; open search
Online Access: https://peercommunityjournal.org/articles/10.24072/pcjournal.491/
author Rodriguez Palomo, Ismael
Nair, Bharath
Chiang, Yun
Dekker, Joannes
Dartigues, Benjamin
Mackie, Meaghan
Evans, Miranda
Macleod, Ruairidh
Olsen, Jesper V.
Collins, Matthew J.
author_sort Rodriguez Palomo, Ismael
collection DOAJ
description Palaeoproteomics is a rapidly evolving discipline, and practitioners are constantly developing novel strategies for the analysis and interpretation of complex, degraded protein mixtures. The community has also established standards of good practice to interrogate our data. However, there is a lack of systematic exploration of how these affect the identification of peptides, post-translational modifications (PTMs) and proteins, as well as their significance (through the False Discovery Rate) and correctness. We systematically investigated the performance of a wide range of sequencing tools and search engines in a controlled system: the experimental degradation of a single purified protein, bovine β-lactoglobulin (BLG), heated at 95°C and pH 7 for 0, 4 and 128 days. We targeted BLG since it is one of the most robust and ubiquitous proteins in the archaeological record. We tested different reference database choices (a targeted dairy protein database and the whole bovine proteome) and three digestion options (tryptic, semi-tryptic and non-specific searches) in order to evaluate the effects of search space on the identification of peptides. We also explored alternative strategies, including open search, which allows for the global identification of PTMs based on a wide precursor mass tolerance, and de novo sequencing to boost sequence coverage. We analysed the samples using Mascot, MaxQuant, Metamorpheus, pFind, Fragpipe and DeNovoGUI (pepNovo+, DirecTag, Novor), benchmarked these tools, and discuss the optimal strategy for the characterisation of ancient proteins. We also studied the physicochemical properties of BLG that correlate with bias in the identification coverage.
format Article
id doaj-art-20ec09271f00409ba25d7a744bc29b46
institution Kabale University
issn 2804-3871
language English
publishDate 2024-11-01
publisher Peer Community In
record_format Article
series Peer Community Journal
spelling doaj-art-20ec09271f00409ba25d7a744bc29b46
Record timestamp: 2025-02-07T10:17:17Z
Language: eng
Publisher: Peer Community In
Journal: Peer Community Journal
ISSN: 2804-3871
Published: 2024-11-01
Volume: 4
DOI: 10.24072/pcjournal.491
Title: Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins
Authors (index, ORCID iD, affiliations):
0 Rodriguez Palomo, Ismael, https://orcid.org/0000-0001-5313-9709: McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom
1 Nair, Bharath, https://orcid.org/0000-0002-1897-4132: McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom; Globe Institute, University of Copenhagen, Denmark
2 Chiang, Yun, https://orcid.org/0000-0003-3605-4054: Globe Institute, University of Copenhagen, Denmark; The Nice Institute of Chemistry, Université Côte d’Azur, Nice, France
3 Dekker, Joannes, https://orcid.org/0000-0002-3952-4448: Globe Institute, University of Copenhagen, Denmark; Department of Archaeology, University of York, United Kingdom
4 Dartigues, Benjamin, https://orcid.org/0000-0003-1882-123X: Department of Science and Technology, University of Bordeaux, France
5 Mackie, Meaghan: Globe Institute, University of Copenhagen, Denmark; Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark; School of Archaeology, University College Dublin, Ireland; Archaeobiomics, Department of Life Sciences and Systems Biology, University of Turin, Italy
6 Evans, Miranda, https://orcid.org/0000-0002-9284-9268: McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom; Department of Archaeology, University of York, United Kingdom
7 Macleod, Ruairidh, https://orcid.org/0000-0001-8086-4420: McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom
8 Olsen, Jesper V., https://orcid.org/0000-0002-4747-4938: Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
9 Collins, Matthew J., https://orcid.org/0000-0003-4226-5501: McDonald Institute for Archaeological Research, University of Cambridge, United Kingdom; Globe Institute, University of Copenhagen, Denmark
URL: https://peercommunityjournal.org/articles/10.24072/pcjournal.491/
Keywords: Palaeoproteomics; beta-lactoglobulin; False Discovery Rate; benchmarking; de novo; open search
title Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins
topic Palaeoproteomics
beta-lactoglobulin
False Discovery Rate
benchmarking
de novo
open search
url https://peercommunityjournal.org/articles/10.24072/pcjournal.491/