Enabling pan-repository reanalysis for big data science of public metabolomics data
Abstract Public untargeted metabolomics data is a growing resource for metabolite and phenotype discovery; however, accessing and utilizing these data across repositories pose significant challenges. Therefore, here we develop pan-repository universal identifiers and harmonized cross-repository meta...
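| Main Authors: | Yasin El Abiead, Michael Strobel, Thomas Payne, Eoin Fahy, Claire O’Donovan, Shankar Subramaniam, Juan Antonio Vizcaíno, Ozgur Yurekten, Victoria Deleray, Simone Zuffa, Shipei Xing, Helena Mannochio-Russo, Ipsita Mohanty, Haoqi Nina Zhao, Andres M. Caraballo-Rodriguez, Paulo Wender P. Gomes, Nicole E. Avalon, Trent R. Northen, Benjamin P. Bowen, Katherine B. Louie, Pieter C. Dorrestein, Mingxun Wang |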
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-025-60067-y |
| _version_ | 1850124724218101760 |
|---|---|
| author | Yasin El Abiead, Michael Strobel, Thomas Payne, Eoin Fahy, Claire O’Donovan, Shankar Subramaniam, Juan Antonio Vizcaíno, Ozgur Yurekten, Victoria Deleray, Simone Zuffa, Shipei Xing, Helena Mannochio-Russo, Ipsita Mohanty, Haoqi Nina Zhao, Andres M. Caraballo-Rodriguez, Paulo Wender P. Gomes, Nicole E. Avalon, Trent R. Northen, Benjamin P. Bowen, Katherine B. Louie, Pieter C. Dorrestein, Mingxun Wang |
| author_sort | Yasin El Abiead |
| collection | DOAJ |
| description | Abstract Public untargeted metabolomics data is a growing resource for metabolite and phenotype discovery; however, accessing and utilizing these data across repositories pose significant challenges. Therefore, here we develop pan-repository universal identifiers and harmonized cross-repository metadata. This ecosystem facilitates discovery by integrating diverse data sources from public repositories including MetaboLights, Metabolomics Workbench, and GNPS/MassIVE. Our approach simplifies data handling and unlocks previously inaccessible reanalysis workflows, fostering unmatched research opportunities. |
| format | Article |
| id | doaj-art-1879a9528dd94a30b698df6037cfb56e |
| institution | OA Journals |
| issn | 2041-1723 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-1879a9528dd94a30b698df6037cfb56e; 2025-08-20T02:34:15Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-05-01; 16117; 10.1038/s41467-025-60067-y; Enabling pan-repository reanalysis for big data science of public metabolomics data. Authors: Yasin El Abiead (0), Michael Strobel (1), Thomas Payne (2), Eoin Fahy (3), Claire O’Donovan (4), Shankar Subramaniam (5), Juan Antonio Vizcaíno (6), Ozgur Yurekten (7), Victoria Deleray (8), Simone Zuffa (9), Shipei Xing (10), Helena Mannochio-Russo (11), Ipsita Mohanty (12), Haoqi Nina Zhao (13), Andres M. Caraballo-Rodriguez (14), Paulo Wender P. Gomes (15), Nicole E. Avalon (16), Trent R. Northen (17), Benjamin P. Bowen (18), Katherine B. Louie (19), Pieter C. Dorrestein (20), Mingxun Wang (21). Affiliations: (0, 8-15, 20) Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego; (1, 21) Department of Computer Science and Engineering, University of California Riverside; (2, 4, 6, 7) European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton; (3, 5) Department of Bioengineering, and San Diego Supercomputer Center, University of California, San Diego; (16) Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego; (17, 18) Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab; (19) The DOE Joint Genome Institute, Lawrence Berkeley National Laboratory. Abstract: Public untargeted metabolomics data is a growing resource for metabolite and phenotype discovery; however, accessing and utilizing these data across repositories pose significant challenges. Therefore, here we develop pan-repository universal identifiers and harmonized cross-repository metadata. This ecosystem facilitates discovery by integrating diverse data sources from public repositories including MetaboLights, Metabolomics Workbench, and GNPS/MassIVE. Our approach simplifies data handling and unlocks previously inaccessible reanalysis workflows, fostering unmatched research opportunities. https://doi.org/10.1038/s41467-025-60067-y |
| title | Enabling pan-repository reanalysis for big data science of public metabolomics data |
| url | https://doi.org/10.1038/s41467-025-60067-y |