MassCube improves accuracy for metabolomics data processing from raw files to phenotype classifiers
Nontargeted peak detection in LC-MS-based metabolomics must become robust and benchmarked. We present MassCube, a Python-based open-source framework for MS data processing that we systematically benchmark against other algorithms and different types of input data. From raw data, peaks are d...
| Main Authors: | Huaxu Yu, Jun Ding, Tong Shen, Min Liu, Yuanyue Li, Oliver Fiehn |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-025-60640-5 |
| _version_ | 1849768615250755584 |
|---|---|
| author | Huaxu Yu; Jun Ding; Tong Shen; Min Liu; Yuanyue Li; Oliver Fiehn |
| author_sort | Huaxu Yu |
| collection | DOAJ |
| description | Nontargeted peak detection in LC-MS-based metabolomics must become robust and benchmarked. We present MassCube, a Python-based open-source framework for MS data processing that we systematically benchmark against other algorithms and different types of input data. From raw data, peaks are detected by constructing mass traces through signal clustering and Gaussian-filter-assisted edge detection. Peaks are then grouped for adduct and in-source fragment detection, and compounds are annotated by both identity and fuzzy searches. Final data tables undergo quality controls and can be used for metabolome-informed phenotype prediction. Peak detection in MassCube achieves 100% signal coverage with comprehensive reporting of chromatographic metadata for quality assurance. MassCube outperforms MS-DIAL, MZmine3, and XCMS in speed, isomer detection, and accuracy. It supports diverse numerical routines for MS data analysis while maintaining efficiency: it is capable of handling 105 GB of Astral MS data on a laptop within 64 min, whereas other programs took 8–24 times longer. MassCube automatically detected age, sex, and regional differences when applied to the Metabolome Atlas of the Aging Mouse Brain data despite batch effects. MassCube is available at https://github.com/huaxuyu/masscube for direct use or implementation into larger applications in omics or biomedical research. |
| format | Article |
| id | doaj-art-be356e228bb6474c887ff99f6e6accb1 |
| institution | DOAJ |
| issn | 2041-1723 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-be356e228bb6474c887ff99f6e6accb1; 2025-08-20T03:03:44Z; eng; Nature Portfolio; Nature Communications; ISSN 2041-1723; 2025-07-01; vol. 16, no. 1, pp. 1–15; 10.1038/s41467-025-60640-5; MassCube improves accuracy for metabolomics data processing from raw files to phenotype classifiers; Huaxu Yu (West Coast Metabolomics Center, University of California Davis); Jun Ding (China CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences); Tong Shen (West Coast Metabolomics Center, University of California Davis); Min Liu (West Coast Metabolomics Center, University of California Davis); Yuanyue Li (West Coast Metabolomics Center, University of California Davis); Oliver Fiehn (West Coast Metabolomics Center, University of California Davis); https://doi.org/10.1038/s41467-025-60640-5 |
| title | MassCube improves accuracy for metabolomics data processing from raw files to phenotype classifiers |
| url | https://doi.org/10.1038/s41467-025-60640-5 |