A Comprehensive Machine Learning Approach for COVID-19 Target Discovery in the Small-Molecule Metabolome

Bibliographic Details
Main Authors: Md. Shaheenur Islam Sumon, Md Sakib Abrar Hossain, Haya Al-Sulaiti, Hadi M. Yassine, Muhammad E. H. Chowdhury
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Metabolites
Subjects: metabolomics; respiratory viruses; machine learning; diagnostic markers; COVID-19
Online Access: https://www.mdpi.com/2218-1989/15/1/44
author Md. Shaheenur Islam Sumon
Md Sakib Abrar Hossain
Haya Al-Sulaiti
Hadi M. Yassine
Muhammad E. H. Chowdhury
collection DOAJ
description Background/Objectives: Respiratory viruses, including Influenza, RSV, and COVID-19, cause various respiratory infections. Distinguishing these viruses relies on diagnostic methods such as PCR testing. Challenges stem from overlapping symptoms and the emergence of new strains. Advanced diagnostics are crucial for accurate detection and effective management. This study leveraged nasopharyngeal metabolome data to predict respiratory virus scenarios including control vs. RSV, control vs. Influenza A, control vs. COVID-19, control vs. all respiratory viruses, and COVID-19 vs. Influenza A/RSV. Method: We proposed a stacking-based ensemble technique, integrating the top three best-performing ML models from the initial results to enhance prediction accuracy by leveraging the strengths of multiple base learners. Key techniques such as feature ranking, standard scaling, and SMOTE were used to address class imbalances, thus enhancing model robustness. SHAP analysis identified crucial metabolites influencing positive predictions, thereby providing valuable insights into diagnostic markers. Results: Our approach not only outperformed existing methods but also revealed top dominant features for predicting COVID-19, including Lysophosphatidylcholine acyl C18:2, Kynurenine, Phenylalanine, Valine, Tyrosine, and Aspartic Acid (Asp). Conclusions: This study demonstrates the effectiveness of leveraging nasopharyngeal metabolome data and stacking-based ensemble techniques for predicting respiratory virus scenarios. The proposed approach enhances prediction accuracy, provides insights into key diagnostic markers, and offers a robust framework for managing respiratory infections.
format Article
id doaj-art-2582023809744a0e967375bc2abd27df
institution Kabale University
issn 2218-1989
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Metabolites
spelling doaj-art-2582023809744a0e967375bc2abd27df, 2025-01-24T13:41:16Z, eng
MDPI AG, Metabolites, ISSN 2218-1989, 2025-01-01, vol. 15, no. 1, art. 44, doi:10.3390/metabo15010044
Md. Shaheenur Islam Sumon: Department of Electrical Engineering, Qatar University, Doha P.O. Box 2713, Qatar
Md Sakib Abrar Hossain: Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
Haya Al-Sulaiti: Department of Biomedical Sciences, College of Health Sciences, Qatar University, Doha P.O. Box 2713, Qatar
Hadi M. Yassine: Department of Biomedical Sciences, College of Health Sciences, Qatar University, Doha P.O. Box 2713, Qatar
Muhammad E. H. Chowdhury: Department of Electrical Engineering, Qatar University, Doha P.O. Box 2713, Qatar
title A Comprehensive Machine Learning Approach for COVID-19 Target Discovery in the Small-Molecule Metabolome
topic metabolomics
respiratory viruses
machine learning
diagnostic markers
COVID-19
url https://www.mdpi.com/2218-1989/15/1/44
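
The description in this record names standard scaling and SMOTE as the steps used to address class imbalance, but the record gives no implementation details. The following is a minimal pure-Python sketch of the general idea only, not the authors' code: the function names (`standard_scale`, `smote`, `balance`), the neighbour count `k`, and the toy two-feature data are all assumptions made for illustration. SMOTE creates each synthetic minority sample by interpolating between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def standard_scale(rows):
    """Scale each column to zero mean and unit variance (population std)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Nearest neighbours within the minority class, excluding the sample itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, nb)])
    return synthetic


def balance(X, y, minority_label, k=3, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, lab in zip(X, y) if lab == minority_label]
    n_new = (len(X) - len(minority)) - len(minority)
    new_rows = smote(minority, max(n_new, 0), k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Hypothetical toy data: 6 controls (label 0), 3 cases (label 1), two features.
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.2], [1.0, 1.8], [0.8, 2.0],
     [3.0, 0.5], [3.2, 0.7], [2.9, 0.4]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_scaled = standard_scale(X)
X_bal, y_bal = balance(X_scaled, y, minority_label=1, k=2)
print(len(X_bal), y_bal.count(0), y_bal.count(1))  # prints: 12 6 6
```

In a real workflow, scaling parameters and synthetic samples would normally be derived from the training split only, so that the held-out data stays untouched; whether the study did this is not stated in the record.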