ROASMI: accelerating small molecule identification by repurposing retention data

Bibliographic Details
Main Authors: Fang-Yuan Sun, Ying-Hao Yin, Hui-Jun Liu, Lu-Na Shen, Xiu-Lin Kang, Gui-Zhong Xin, Li-Fang Liu, Jia-Yi Zheng
Format: Article
Language: English
Published: BMC 2025-02-01
Series: Journal of Cheminformatics
Subjects: Metabolomics; Retention order; Small-molecule identification; Replicability; Deep learning
Online Access: https://doi.org/10.1186/s13321-025-00968-8
collection DOAJ
description Abstract The limited replicability of retention data hinders its application in untargeted metabolomics for small molecule identification. While retention order models hold promise in addressing this issue, their predictive reliability is limited by uncertain generalizability. Here, we present the ROASMI model, which enables reliable prediction of retention order within a well-defined application domain by coupling data-driven molecular representation and mechanistic insights. The generalizability of ROASMI is proven by 71 independent reversed-phase liquid chromatography (RPLC) datasets. The application of ROASMI to four real-world datasets demonstrates its advantages in distinguishing coexisting isomers with similar fragmentation patterns and in annotating detection peaks without informative spectra. ROASMI is flexible enough to be retrained with user-defined reference sets and is compatible with other MS/MS scorers, enabling further improvements in small-molecule identification.
id doaj-art-7cc5c6c80e664b2e9c8dd3eeb42d49eb
institution DOAJ
issn 1758-2946
spelling doaj-art-7cc5c6c80e664b2e9c8dd3eeb42d49eb; 2025-08-20T03:00:58Z; eng; BMC; Journal of Cheminformatics; ISSN 1758-2946; 2025-02-01; vol. 17, no. 1, pp. 1-15; doi:10.1186/s13321-025-00968-8
Author affiliation (all eight authors): State Key Laboratory of Natural Medicines, Department of Chinese Medicines Analysis, School of Traditional Chinese Pharmacy, China Pharmaceutical University
topic Metabolomics
Retention order
Small-molecule identification
Replicability
Deep learning