Early Prediction of Stroke Risk Using Machine Learning Approaches and Imbalanced Data

Bibliographic Details
Main Author: Hassan Qassim
Format: Article
Language: English
Published: Northern Technical University 2025-03-01
Series: NTU Journal of Engineering and Technology
Online Access: https://journals.ntu.edu.iq/index.php/NTU-JET/article/view/1172
Description
Summary: Classifying medical datasets with machine learning algorithms could help physicians provide accurate diagnoses and suitable treatment. For instance, stroke is a serious disease that affects many patients annually, and analyzing its symptoms in advance could save patients’ lives. The warning signs of stroke can be used as attributes, or predictors, for machine learning models. This study evaluates the performance of four machine learning models in classifying a stroke dataset. Specifically, Decision Tree, Naïve Bayes, K-Nearest Neighbor (KNN), and Linear Discriminant Analysis (LDA) models were trained on 11 attributes collected from 5110 patients to predict stroke risk. The findings showed that KNN outperformed the other three models, achieving an accuracy of 90%. The study also considered balancing the data prior to validating the models in order to provide accurate classification. Cross-validation was used to avoid over-fitting and under-fitting during the training phases.
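The combination of class balancing and cross-validation mentioned in the summary can be illustrated with a minimal sketch. It is not the study's implementation: the record does not say which balancing technique or fold count was used, so random oversampling, 5 folds, k = 3, and the synthetic two-feature data are all assumptions made for illustration. The helper names (`oversample`, `knn_predict`, `cross_validate`) are hypothetical. The sketch balances only the training portion of each fold, a design choice that keeps duplicated minority rows out of the held-out fold.

```python
import random
from collections import Counter


def oversample(X, y, rng):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_bal.append(row)
            y_bal.append(label)
    return X_bal, y_bal


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, folds=5, k=3, seed=0):
    """Return per-fold accuracy; balancing is applied to the training rows only."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    scores = []
    for f in range(folds):
        test_idx = idx[f::folds]  # every folds-th shuffled index forms one fold
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr, y_tr = oversample(
            [X[i] for i in train_idx], [y[i] for i in train_idx], rng
        )
        correct = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Synthetic, imbalanced toy data (assumption): 90 negatives, 10 positives.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

scores = cross_validate(X, y)
print("mean accuracy over folds:", sum(scores) / len(scores))
```

On imbalanced data, accuracy alone can look high even when the minority class is rarely detected, so a fuller evaluation would typically also report per-class recall or a similar measure.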
ISSN: 2788-9971
2788-998X