Hybrid Random Feature Selection and Recurrent Neural Network for Diabetes Prediction

This paper proposes a novel two-stage ensemble framework combining Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM) with randomized feature selection to enhance diabetes prediction accuracy and calibration. The method first trains multiple LSTM/BiLSTM base models on dynamically sampled...

Full description

Saved in:

Bibliographic Details
Main Authors:	Oyebayo Ridwan Olaniran, Aliu Omotayo Sikiru, Jeza Allohibi, Abdulmajeed Atiah Alharbi, Nada MohammedSaeed Alharbi
Format:	Article
Language:	English
Published:	MDPI AG 2025-02-01
Series:	Mathematics
Subjects:	recurrent neural network long short-term memory diabetes prediction ensemble learning
Online Access:	https://www.mdpi.com/2227-7390/13/4/628
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	This paper proposes a novel two-stage ensemble framework combining Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM) with randomized feature selection to enhance diabetes prediction accuracy and calibration. The method first trains multiple LSTM/BiLSTM base models on dynamically sampled feature subsets to promote diversity, followed by a meta-learner that integrates predictions into a final robust output. A systematic simulation study conducted reveals that feature selection proportion critically impacts generalization: mid-range values (0.5–0.8 for LSTM; 0.6–0.8 for BiLSTM) optimize performance, while values close to 1 induce overfitting. Furthermore, real-life data evaluation on three benchmark datasets—Pima Indian Diabetes, Diabetic Retinopathy Debrecen, and Early Stage Diabetes Risk Prediction—revealed that the framework achieves state-of-the-art results, surpassing conventional (random forest, support vector machine) and recent hybrid frameworks with an accuracy of up to 100%, AUC of 99.1–100%, and superior calibration (Brier score: 0.006–0.023). Notably, the BiLSTM variant consistently outperforms unidirectional LSTM in the proposed framework, particularly in sensitivity (98.4% vs. 97.0% on retinopathy data), highlighting its strength in capturing temporal dependencies.
ISSN:	2227-7390

Hybrid Random Feature Selection and Recurrent Neural Network for Diabetes Prediction

Similar Items