Learning motif features and topological structure of molecules for metabolic pathway prediction

Bibliographic Details
Main Authors: Jianguo Hu, Yiqing Zhang, Jinxin Xie, Zhen Yuan, Zhangxiang Yin, Shanshan Shi, Honglin Li, Shiliang Li
Format: Article
Language: English
Published: BMC 2025-04-01
Series: Journal of Cheminformatics
Online Access: https://doi.org/10.1186/s13321-025-00994-6
author Jianguo Hu
Yiqing Zhang
Jinxin Xie
Zhen Yuan
Zhangxiang Yin
Shanshan Shi
Honglin Li
Shiliang Li
collection DOAJ
description Abstract: Metabolites serve as crucial biomarkers for assessing disease progression and understanding underlying pathogenic mechanisms. However, when the metabolic pathway category of a metabolite is unknown, metabolomic analysis becomes difficult. Because identifying pathways through wet-laboratory experiments is complex, there is a growing demand for predictive methods. Various computational approaches, including machine learning and graph neural networks, have been proposed; however, interpretability remains a challenge. We have developed a neural network framework called MotifMol3D for predicting the metabolic pathway categories of molecules. The framework uses motif information to mine local features of small-sample molecules and combines it with a graph neural network and 3D information to complete the prediction task. On a dataset of 5,698 molecules that participate in 11 metabolic pathway categories in the KEGG database, MotifMol3D outperformed state-of-the-art methods in precision, recall, and F1 score. In addition, an ablation study and motif analysis demonstrated the effectiveness and usefulness of the model. Motif analysis, in particular, showed that motif information can characterize the main features of pathway-specific molecules to a certain extent and enhance the interpretability of the model. An external validation further corroborates this observation. MotifMol3D is an open-source tool available at https://github.com/Irena-Zhang/MotifMol3D.git.
Scientific contribution: MotifMol3D integrates motif information, graph neural networks, and 3D structural data to enhance feature extraction for small-sample molecules, improving the precision and interpretability of metabolic pathway predictions. The model outperforms state-of-the-art approaches in precision, recall, and F1 score. This work reveals how motif information characterizes pathway-specific molecules, offering novel insights into molecular properties within metabolic pathways.
format Article
id doaj-art-d2f18ae56b8e4a1784d8618c9e2bcc69
institution Kabale University
issn 1758-2946
language English
publishDate 2025-04-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj-art-d2f18ae56b8e4a1784d8618c9e2bcc69
2025-08-20T03:53:32Z
eng
BMC
Journal of Cheminformatics
1758-2946
2025-04-01
171114
10.1186/s13321-025-00994-6
Learning motif features and topological structure of molecules for metabolic pathway prediction
Jianguo Hu, Yiqing Zhang, Jinxin Xie, Zhen Yuan, Zhangxiang Yin, Shanshan Shi, Honglin Li, Shiliang Li
Affiliation (all authors): Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology
https://doi.org/10.1186/s13321-025-00994-6
title Learning motif features and topological structure of molecules for metabolic pathway prediction
url https://doi.org/10.1186/s13321-025-00994-6