Improving Cardiovascular Disease Prediction With Deep Learning and Correlation-Aware SMOTE

Cardiovascular disease (CVD) ranks among the top causes of mortality globally, underscoring the urgent necessity for advanced predictive models to enhance early detection and preventative measures. In this direction, this study investigates the performance of five well-established deep learning (DL)...

Bibliographic Details
Main Authors: Maria Trigka, Elias Dritsas
Format: Article
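Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Cardiovascular disease prediction; deep learning; enhanced SMOTE; class imbalance; healthcare analytics
Online Access: https://ieeexplore.ieee.org/document/10916648/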
collection DOAJ
description Cardiovascular disease (CVD) ranks among the leading causes of mortality globally, underscoring the need for predictive models that improve early detection and prevention. This study investigates the performance of five well-established deep learning (DL) models, namely Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Autoencoder, in predicting CVD using a diverse patient dataset. To tackle the class imbalance prevalent in medical datasets, we introduce an enhanced Synthetic Minority Over-sampling Technique (SMOTE) that extends traditional SMOTE by incorporating feature correlations to produce more realistic synthetic samples. We compare model performance across three scenarios (without SMOTE, with traditional SMOTE, and with enhanced SMOTE) using metrics such as Accuracy, Precision, Recall, F1-Score, and Area Under the Curve (AUC). Our results show that the enhanced SMOTE significantly improves model performance, especially in Recall and AUC. Notably, the CNN model with enhanced SMOTE achieved the highest overall performance, with an AUC of 0.90, an Accuracy of 0.91, a Precision of 0.89, a Recall of 0.86, and an F1-Score of 0.87, making it the most effective model in this study. This research highlights the potential of the enhanced SMOTE for developing robust predictive models for CVD, with broader implications for healthcare analytics.
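The record does not specify how the enhanced SMOTE uses feature correlations, so the following is only a minimal sketch of the general idea, not the authors' method. It implements the classic SMOTE interpolation step (a synthetic sample is placed on the segment between a minority sample and one of its k nearest minority neighbors) and adds one hypothetical way to make the neighbor search correlation-aware: features that correlate more strongly with the others get a larger weight in the distance. The names `pearson`, `correlation_weights`, and `smote`, and the weighting rule itself, are assumptions introduced for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_weights(minority):
    """Hypothetical rule: weight = 1 + mean |correlation| of a feature with the others."""
    d = len(minority[0])
    cols = [[row[j] for row in minority] for j in range(d)]
    weights = []
    for j in range(d):
        others = [abs(pearson(cols[j], cols[k])) for k in range(d) if k != j]
        weights.append(1.0 + (sum(others) / len(others) if others else 0.0))
    return weights


def smote(minority, n_new, k=5, weights=None, seed=0):
    """Classic SMOTE interpolation; `weights` optionally rescales the neighbor distance.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    d = len(minority[0])
    w = weights or [1.0] * d

    def dist(a, b):
        return math.sqrt(sum(wj * (aj - bj) ** 2 for wj, aj, bj in zip(w, a, b)))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbors of the base sample (excluding itself).
        candidates = [j for j in range(len(minority)) if j != i]
        neighbors = sorted(candidates, key=lambda j: dist(base, minority[j]))[:k]
        neighbor = minority[rng.choice(neighbors)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


if __name__ == "__main__":
    # Toy minority-class samples with two strongly correlated features (made-up values).
    minority = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
    w = correlation_weights(minority)
    for point in smote(minority, n_new=3, k=2, weights=w, seed=1):
        print(point)
```

Calling `smote` without `weights` gives the traditional variant, which makes it easy to compare the two oversampling scenarios on the same data. In practice, oversampling would be applied to the training split only, so that synthetic samples do not leak into evaluation.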
id doaj-art-b621fc2b81484f5c94054e349288aa8f
institution DOAJ
issn 2169-3536
spelling doaj-art-b621fc2b81484f5c94054e349288aa8f
record_timestamp 2025-08-20T03:02:55Z
language_code eng
volume 13
pages 44590-44606
doi 10.1109/ACCESS.2025.3549417
document_number 10916648
author_orcid Maria Trigka https://orcid.org/0000-0001-7793-0407
Elias Dritsas https://orcid.org/0000-0001-5647-2929
author_affiliation Maria Trigka: Department of Informatics and Computer Engineering, University of West Attica, Egaleo Park Campus, Athens, Greece
Elias Dritsas: Department of Informatics and Computer Engineering, University of West Attica, Egaleo Park Campus, Athens, Greece
title Improving Cardiovascular Disease Prediction With Deep Learning and Correlation-Aware SMOTE
topic Cardiovascular disease prediction
deep learning
enhanced SMOTE
class imbalance
healthcare analytics