Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data

Abstract Untargeted metabolomics can comprehensively map the chemical space of a biome, but is limited by low annotation rates (< 10%). We used chemical characteristics vectors, consisting of molecular fingerprints or chemical compound classes, predicted from mass spectrometry data, to characteri...

Bibliographic Details
Main Authors: Pilleriin Peets, Aristeidis Litos, Kai Dührkop, Daniel R. Garza, Justin J. J. van der Hooft, Sebastian Böcker, Bas E. Dutilh
Format: Article
Language: English
Published: BMC 2025-05-01
Series: Journal of Cheminformatics
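Subjects: Untargeted metabolomics; Nontargeted screening; Mass spectrometry; Bioinformatics; Cheminformatics; Earth microbiome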
Online Access: https://doi.org/10.1186/s13321-025-01031-2
author Pilleriin Peets
Aristeidis Litos
Kai Dührkop
Daniel R. Garza
Justin J. J. van der Hooft
Sebastian Böcker
Bas E. Dutilh
collection DOAJ
description Abstract Untargeted metabolomics can comprehensively map the chemical space of a biome, but is limited by low annotation rates (< 10%). We used chemical characteristics vectors, consisting of molecular fingerprints or chemical compound classes predicted from mass spectrometry data, to characterize compounds and samples. These chemical characteristics vectors (CCVs) estimate the fraction of compounds with specific chemical properties in a sample. Unlike aligned MS1 data with intensity information, CCVs incorporate the chemical properties of compounds, allowing chemical annotation to be used for sample comparison. Using CCVs, we identified compound classes that differentiate biomes, such as ethers, which are enriched in environmental biomes, and steroids, which are enriched in animal host-related biomes. In biomes with greater variability, CCVs revealed the compound classes that drive clustering, such as organonitrogen compounds in the animal distal gut and lipids in animal secretions. CCVs thus enhance the interpretation of untargeted metabolomic data, providing a quantifiable and generalizable understanding of the chemical space of natural biomes.
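The abstract describes a CCV as an estimate of the fraction of compounds in a sample that carry a given chemical property. A minimal sketch of that idea follows, assuming each compound already has a predicted value per property (1.0 for present, 0.0 for absent, or a probability in between). The function name `sample_ccv`, the property names, and the example values are hypothetical and are not taken from the article or from any specific tool.

```python
from typing import Dict, List


def sample_ccv(compound_predictions: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-compound property predictions into one sample-level vector.

    Each dict maps a property name (e.g. a compound class) to a predicted
    value for one compound. A property missing from a dict counts as 0.0.
    """
    if not compound_predictions:
        return {}
    # Collect every property seen in at least one compound.
    properties = sorted({p for pred in compound_predictions for p in pred})
    n = len(compound_predictions)
    # Mean over compounds = estimated fraction of compounds with the property.
    return {
        prop: sum(pred.get(prop, 0.0) for pred in compound_predictions) / n
        for prop in properties
    }


# Hypothetical sample with four detected compounds (illustrative values only).
sample = [
    {"ether": 1.0, "steroid": 0.0},
    {"ether": 1.0, "steroid": 0.0},
    {"ether": 0.0, "steroid": 1.0},
    {"ether": 1.0, "steroid": 0.0},
]
ccv = sample_ccv(sample)
print(ccv)
```

Because every sample is reduced to a vector over the same property names, two samples can be compared even when they share no annotated compounds, which is the kind of sample comparison the abstract refers to.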
format Article
id doaj-art-bc3931fca0314746bcdb970acfbd96f5
institution DOAJ
issn 1758-2946
language English
publishDate 2025-05-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj-art-bc3931fca0314746bcdb970acfbd96f5
2025-08-20T03:22:11Z
eng
BMC
Journal of Cheminformatics
1758-2946
2025-05-01
volume 17, issue 1, pages 1-17
10.1186/s13321-025-01031-2
Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data
Pilleriin Peets (Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University)
Aristeidis Litos (Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University)
Kai Dührkop (Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena)
Daniel R. Garza (INRAE, PROSE, Université Paris-Saclay)
Justin J. J. van der Hooft (Bioinformatics Group, Wageningen University & Research)
Sebastian Böcker (Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena)
Bas E. Dutilh (Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University)
https://doi.org/10.1186/s13321-025-01031-2
Untargeted metabolomics
Nontargeted screening
Mass spectrometry
Bioinformatics
Cheminformatics
Earth microbiome
title Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data
topic Untargeted metabolomics
Nontargeted screening
Mass spectrometry
Bioinformatics
Cheminformatics
Earth microbiome
url https://doi.org/10.1186/s13321-025-01031-2