An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes

Diabetes is a chronic disease caused by elevated blood sugar (glucose) levels. This study proposes a novel attention-based loss function and a lightweight artificial neural network (ANN) called Diabetic Lite (DB-Lite) for diabetes prediction on the Pima Indian Diabetes Dataset (PIDD). We...

Bibliographic Details
Main Authors:	Santanu Roy, Reshma Rachel Cherish, Gifty Roy
Format:	Article
Language:	English
Published:	Elsevier 2025-06-01
Series:	Healthcare Analytics
Subjects:	Artificial neural network; Diabetes prediction; Class imbalance; Attention-based binary cross entropy; Synthetic minority oversampling technique (SMOTE)
Online Access:	http://www.sciencedirect.com/science/article/pii/S2772442525000188

_version_	1850118084498554880
author	Santanu Roy; Reshma Rachel Cherish; Gifty Roy
author_facet	Santanu Roy; Reshma Rachel Cherish; Gifty Roy
author_sort	Santanu Roy
collection	DOAJ
description	Diabetes is a chronic disease caused by elevated blood sugar (glucose) levels. This study proposes a novel attention-based loss function and a lightweight artificial neural network (ANN) called Diabetic Lite (DB-Lite) for diabetes prediction on the Pima Indian Diabetes Dataset (PIDD). We show that the Pima dataset poses several challenges: it is small and imbalanced, and many of its features are non-linearly correlated. The novelties of this work are as follows: (i) A novel attention-based binary cross entropy (ABCE) loss function is proposed for the first time to alleviate the statistical imbalance present in the Pima dataset. This ABCE loss function is incorporated in the DB-Lite model, which is trained from scratch. (ii) A Swish activation function is used in the hidden layer of DB-Lite instead of the Rectified Linear Unit (ReLU) to handle the non-linear dependency between the features and the final outcome. (iii) The synthetic minority oversampling technique (SMOTE) is used as a pre-processing step to mitigate the class imbalance in the Pima dataset. (iv) An adaptive learning rate is used during training to speed up the convergence of the DB-Lite model. Our final proposed framework achieved 99.7% accuracy, 99.4% precision, 99.8% recall, and a 99.6% F1 score in testing, which is, to our knowledge, the best reported result on the Pima dataset. Welch's t-test (a statistical hypothesis test) and 10-fold cross-validation are used to support the validity of the proposed loss function.
format	Article
id	doaj-art-8a868236b0dc416c9fec6992d172cf36
institution	OA Journals
issn	2772-4425
language	English
publishDate	2025-06-01
publisher	Elsevier
record_format	Article
series	Healthcare Analytics
spelling	doaj-art-8a868236b0dc416c9fec6992d172cf36 | 2025-08-20T02:35:57Z | eng | Elsevier | Healthcare Analytics | 2772-4425 | 2025-06-01 | 7 | 100399 | 10.1016/j.health.2025.100399 | An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes | Santanu Roy (0), Reshma Rachel Cherish (1), Gifty Roy (2) | (0) Pandit Deendayal Energy University (PDEU), Department of Computer Science and Engineering, Gandhinagar, India; Corresponding author. (1) Ramaiah Institute of Technology, Department of Computer Science and Engineering, Bangalore, India (2) Christ (Deemed to be University), Department of Computer Science and Engineering, Bangalore, India | Diabetes is a chronic disease due to higher blood sugar (or Glucose) levels in the blood. This study proposes a novel attention-based loss function and a lightweight artificial neural network (ANN) called Diabetic Lite (DB-Lite) for diabetes prediction in the Pima Indian Diabetes Dataset (PIDD). We show that the Pima dataset has many challenges. It is a small and imbalanced dataset; moreover, many features are non-linearly correlated in this dataset. The novelties of this research work are as follows: (i) A novel loss function of attention-based binary cross entropy (ABCE) is proposed for the first time to alleviate the statistical imbalance present within the Pima dataset. This ABCE loss function is incorporated in the DB-Lite model, which is trained from scratch. (ii) A Swish activation function is deployed in the hidden layer of DB-Lite instead of Rectified Linear Unit (ReLU) to deal with the non-linear dependency of features with the final outcome. (iii) The synthetic minority oversampling technique (SMOTE) is used as a pre-processing technique to mitigate the class imbalance problem from the Pima dataset. (iv) An adaptive learning rate is utilized while training the model to speed up the convergence of the DB-Lite model. Our final proposed framework has achieved 99.7% accuracy, 99.4% precision, 99.8% recall, and 99.6% F1 score in testing, which is the best result on this Pima dataset. The Welch t-testing (as a statistical hypothesis testing) and 10-fold cross-validation are utilized to prove the validity of the proposed loss function. | http://www.sciencedirect.com/science/article/pii/S2772442525000188 | Artificial neural network; Diabetes prediction; Class imbalance; Attention-based binary cross entropy; Synthetic minority oversampling technique (SMOTE)
spellingShingle	Santanu Roy Reshma Rachel Cherish Gifty Roy An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes Healthcare Analytics Artificial neural network Diabetes prediction Class imbalance Attention-based binary cross entropy Synthetic minority oversampling technique (SMOTE)
title	An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes
title_full	An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes
title_fullStr	An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes
title_full_unstemmed	An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes
title_short	An attention-based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes
title_sort	attention based loss function and synthetic minority oversampling technique for alleviating class imbalance in predicting diabetes
topic	Artificial neural network; Diabetes prediction; Class imbalance; Attention-based binary cross entropy; Synthetic minority oversampling technique (SMOTE)
url	http://www.sciencedirect.com/science/article/pii/S2772442525000188
work_keys_str_mv	AT santanuroy anattentionbasedlossfunctionandsyntheticminorityoversamplingtechniqueforalleviatingclassimbalanceinpredictingdiabetes AT reshmarachelcherish anattentionbasedlossfunctionandsyntheticminorityoversamplingtechniqueforalleviatingclassimbalanceinpredictingdiabetes AT giftyroy anattentionbasedlossfunctionandsyntheticminorityoversamplingtechniqueforalleviatingclassimbalanceinpredictingdiabetes AT santanuroy attentionbasedlossfunctionandsyntheticminorityoversamplingtechniqueforalleviatingclassimbalanceinpredictingdiabetes AT reshmarachelcherish attentionbasedlossfunctionandsyntheticminorityoversamplingtechniqueforalleviatingclassimbalanceinpredictingdiabetes AT giftyroy attentionbasedlossfunctionandsyntheticminorityoversamplingtechniqueforalleviatingclassimbalanceinpredictingdiabetes
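
Illustrative sketch (not taken from the article): the description names three standard building blocks, the Swish activation, SMOTE-style oversampling, and Welch's t-test. The Python below shows the textbook form of each using only the standard library. The function names (swish, smote_like, welch_t), the parameters k and seed, and the toy data are assumptions made for illustration; this is not the authors' DB-Lite code, and the ABCE loss is left out because the record does not define it.

```python
import math
import random


def swish(x: float) -> float:
    """Swish activation in its common form: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (the core SMOTE idea).

    Each new point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    # Unbiased sample variances (divide by n - 1).
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


if __name__ == "__main__":
    # Toy minority class: the four corners of the unit square.
    corners = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(swish(1.0))
    print(smote_like(corners, n_new=2))
    print(welch_t([2, 4, 6], [1, 2, 3]))
```

In practice a maintained implementation (for example the SMOTE class in imbalanced-learn, or a statistics package's Welch test, which also reports a p-value) would normally be preferred over these hand-written versions.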


Similar Items