Machine Learning-Based Ensemble Feature Selection and Nested Cross-Validation for miRNA Biomarker Discovery in Usher Syndrome

Usher syndrome (USH) is a rare genetic disorder affecting vision, hearing, and balance. Identifying reliable biomarkers is crucial for early diagnosis and understanding disease mechanisms. MicroRNAs (miRNAs), key regulators of gene expression, hold promise as biomarkers for USH. This study aimed to...

Bibliographic Details
Main Authors: Rama Krishna Thelagathoti, Dinesh S. Chandel, Wesley A. Tom, Chao Jiang, Gary Krzyzanowski, Appolinaire Olou, M. Rohan Fernando
Format: Article
Language: English
Published: MDPI AG 2025-05-01
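Series: Bioengineering
Subjects: ensemble feature selection; biomarker discovery; usher syndrome; miRNA; machine learning; nested cross-validation
Online Access: https://www.mdpi.com/2306-5354/12/5/497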
author Rama Krishna Thelagathoti
Dinesh S. Chandel
Wesley A. Tom
Chao Jiang
Gary Krzyzanowski
Appolinaire Olou
M. Rohan Fernando
collection DOAJ
description Usher syndrome (USH) is a rare genetic disorder affecting vision, hearing, and balance. Identifying reliable biomarkers is crucial for early diagnosis and understanding disease mechanisms. MicroRNAs (miRNAs), key regulators of gene expression, hold promise as biomarkers for USH. This study aimed to identify a minimal subset of miRNAs that could serve as biomarkers to effectively differentiate USH from controls. We employed ensemble feature selection techniques to select the top miRNAs appearing in at least three algorithms. Machine learning models were trained and tested using this subset, followed by validation on an independent 10% sample. Our approach identified 10 key miRNAs as potential biomarkers for USH. To further validate their biological relevance, we conducted pathway analysis, which revealed significant pathways associated with USH. Furthermore, our approach achieved high classification performance, with an accuracy of 97.7%, sensitivity of 98%, specificity of 92.5%, F1 score of 95.8%, and an AUC of 97.5%. These findings demonstrate that combining ensemble feature selection with machine learning provides a robust strategy for miRNA biomarker discovery, advancing USH diagnosis and molecular understanding.
format Article
id doaj-art-36c25a879cdf4ba1981dec9fcdc6d2bc
institution OA Journals
issn 2306-5354
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Bioengineering
spelling doaj-art-36c25a879cdf4ba1981dec9fcdc6d2bc; 2025-08-20T01:56:17Z; eng; MDPI AG; Bioengineering; ISSN 2306-5354; 2025-05-01; vol. 12, no. 5, art. 497; DOI 10.3390/bioengineering12050497
All seven authors: Molecular Diagnostic Research Laboratory, Center for Sensory Neuroscience, Boys Town National Research Hospital, Omaha, NE 68010, USA
title Machine Learning-Based Ensemble Feature Selection and Nested Cross-Validation for miRNA Biomarker Discovery in Usher Syndrome
topic ensemble feature selection
biomarker discovery
usher syndrome
miRNA
machine learning
nested cross-validation
url https://www.mdpi.com/2306-5354/12/5/497