Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data
Abstract Untargeted metabolomics can comprehensively map the chemical space of a biome, but is limited by low annotation rates (< 10%). We used chemical characteristics vectors, consisting of molecular fingerprints or chemical compound classes, predicted from mass spectrometry data, to characteri...
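| Main Authors: | Pilleriin Peets, Aristeidis Litos, Kai Dührkop, Daniel R. Garza, Justin J. J. van der Hooft, Sebastian Böcker, Bas E. Dutilh |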
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-05-01 |
| Series: | Journal of Cheminformatics |
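| Subjects: | Untargeted metabolomics; Nontargeted screening; Mass spectrometry; Bioinformatics; Cheminformatics; Earth microbiome |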
| Online Access: | https://doi.org/10.1186/s13321-025-01031-2 |
| _version_ | 1849687970522595328 |
|---|---|
| author | Pilleriin Peets; Aristeidis Litos; Kai Dührkop; Daniel R. Garza; Justin J. J. van der Hooft; Sebastian Böcker; Bas E. Dutilh |
| author_sort | Pilleriin Peets |
| collection | DOAJ |
| description | Untargeted metabolomics can comprehensively map the chemical space of a biome but is limited by low annotation rates (< 10%). We used chemical characteristics vectors (CCVs), consisting of molecular fingerprints or chemical compound classes predicted from mass spectrometry data, to characterize compounds and samples. CCVs estimate the fraction of compounds with specific chemical properties in a sample. Unlike aligned MS1 data with intensity information, CCVs incorporate the chemical properties of compounds, allowing chemical annotation to be used for sample comparison. Using CCVs, we identified compound classes that differentiate biomes: ethers are enriched in environmental biomes, whereas steroids are enriched in animal host-related biomes. In biomes with greater variability, CCVs revealed the key compound classes underlying the clustering, such as organonitrogen compounds in animal distal gut and lipids in animal secretions. CCVs thus enhance the interpretation of untargeted metabolomic data, providing a quantifiable and generalizable understanding of the chemical space of natural biomes. |
| format | Article |
| id | doaj-art-bc3931fca0314746bcdb970acfbd96f5 |
| institution | DOAJ |
| issn | 1758-2946 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | BMC |
| record_format | Article |
| series | Journal of Cheminformatics |
| spelling | doaj-art-bc3931fca0314746bcdb970acfbd96f5; 2025-08-20T03:22:11Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2025-05-01; 17; 1; 1; 17; 10.1186/s13321-025-01031-2; Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data; Pilleriin Peets (Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University); Aristeidis Litos (Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University); Kai Dührkop (Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena); Daniel R. Garza (INRAE, PROSE, Université Paris-Saclay); Justin J. J. van der Hooft (Bioinformatics Group, Wageningen University & Research); Sebastian Böcker (Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena); Bas E. Dutilh (Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University); https://doi.org/10.1186/s13321-025-01031-2; Untargeted metabolomics; Nontargeted screening; Mass spectrometry; Bioinformatics; Cheminformatics; Earth microbiome |
| title | Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data |
| topic | Untargeted metabolomics; Nontargeted screening; Mass spectrometry; Bioinformatics; Cheminformatics; Earth microbiome |
| url | https://doi.org/10.1186/s13321-025-01031-2 |
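
The abstract describes a CCV as an estimate of the fraction of compounds in a sample that have specific chemical properties. The following is a minimal sketch of that idea only, not the authors' published method: it assumes each compound comes with a vector of predicted property probabilities, and it counts a property as present when the probability reaches a cutoff. The function name `sample_ccv`, the 0.5 cutoff, the property names, and the toy values are all hypothetical.

```python
from typing import List


def sample_ccv(compound_vectors: List[List[float]], threshold: float = 0.5) -> List[float]:
    """Return, per property, the fraction of compounds predicted to have it.

    compound_vectors: one row per compound, one predicted probability per property.
    threshold: hypothetical cutoff above which a property counts as present.
    """
    if not compound_vectors:
        raise ValueError("sample contains no compounds")
    n_props = len(compound_vectors[0])
    if any(len(vec) != n_props for vec in compound_vectors):
        raise ValueError("all compound vectors must have the same length")

    # Count, for each property, how many compounds pass the cutoff.
    counts = [0] * n_props
    for vec in compound_vectors:
        for j, prob in enumerate(vec):
            if prob >= threshold:
                counts[j] += 1

    # Normalise counts by the number of compounds to obtain fractions.
    return [count / len(compound_vectors) for count in counts]


# Toy data (invented): four compounds, three illustrative properties.
properties = ["ether", "steroid", "organonitrogen"]
toy_sample = [
    [0.9, 0.1, 0.2],
    [0.8, 0.0, 0.6],
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.4],
]
ccv = sample_ccv(toy_sample)
for name, fraction in zip(properties, ccv):
    print(f"{name}: {fraction:.2f}")
```

Because every sample is reduced to a vector of the same length, samples can be compared property by property even when they share few annotated compounds, which is the use the abstract emphasizes.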