Comparison of Support Vector Machine (SVM) and Random Forest (RF) Algorithm Performance with Random Undersampling Technique to Predict Gestational Diabetes Mellitus Risk

Bibliographic Details
Main Authors: Annisa Damayanti, Anna Baita
Format: Article
Language: English
Published: Politeknik Negeri Batam 2025-03-01
Series: Journal of Applied Informatics and Computing
Subjects: GDM (dmg), prediction (prediksi), SVM, undersampling
Online Access: https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/9009
author Annisa Damayanti
Anna Baita
collection DOAJ
description Gestational Diabetes Mellitus (GDM) is glucose intolerance that develops during pregnancy and persists until delivery, characterized by abnormally elevated blood sugar levels. Accurate early diagnosis is important because it provides information that can speed up treatment and reduce complications for the mother and baby. Machine learning methods that can be used to predict GDM include the Support Vector Machine (SVM) and Random Forest (RF) algorithms. This study compares and evaluates GDM prediction models built with SVM and RF, balancing the target classes with the random undersampling technique. Random undersampling increased accuracy by 18% relative to the accuracy obtained without it. The SVM model is tuned over the kernel, C (cost), and gamma hyperparameters, while the RF model is tuned with a scoring metric and four other parameters: n_estimators, max_depth, min_samples_split, and min_samples_leaf. The best parameters for both models are searched with GridSearchCV. The SVM classifier with random undersampling and hyperparameter tuning under K-Fold cross-validation achieved an average accuracy of 100%, with precision, recall, and F1-score also at 100%. Its best parameters were a linear kernel, C = 0.1, and gamma = 0.001, and its ROC-AUC value of 99% indicates very good predictive performance. The RF model achieved an accuracy of 99%; tuning its parameters produced the same accuracy of 99%, with a ROC-AUC value of 99% as well. Both algorithms therefore predict GDM very well, but SVM predicts GDM better than RF because it makes fewer prediction errors.
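The abstract names random undersampling as the class-balancing step but does not spell it out. The following is a minimal sketch of the general technique in plain Python, not the authors' implementation: the function name `random_undersample`, the seed, and the toy data are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples from the larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Every class is cut down to the size of the rarest class.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # keep the surviving samples in their original order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (assumed, not from the study): 8 negative and 3 positive records.
rows = [[90 + i] for i in range(11)]
labels = [0] * 8 + [1] * 3

balanced_rows, balanced_labels = random_undersample(rows, labels, seed=42)
class_counts = Counter(balanced_labels)
print(class_counts)  # both classes now have 3 samples
```

In a cross-validated setup such as the one the abstract describes, undersampling is usually applied to the training folds only, so that the evaluation folds keep the original class distribution; whether the study did this is not stated in the record.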
format Article
id doaj-art-68c6513bece0488f92fdc8c110cc9e96
institution DOAJ
issn 2548-6861
language English
publishDate 2025-03-01
publisher Politeknik Negeri Batam
record_format Article
series Journal of Applied Informatics and Computing
spelling doaj-art-68c6513bece0488f92fdc8c110cc9e96; 2025-08-20T03:13:45Z; eng; Politeknik Negeri Batam; Journal of Applied Informatics and Computing; 2548-6861; 2025-03-01; 9; 2; 328; 337; 10.30871/jaic.v9i2.9009; 6589; Annisa Damayanti (Universitas Amikom Yogyakarta); Anna Baita (Universitas Amikom Yogyakarta); https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/9009; dmg; prediksi; svm; undersampling
title Comparison of Support Vector Machine (SVM) and Random Forest (RF) Algorithm Performance with Random Undersampling Technique to Predict Gestational Diabetes Mellitus Risk
topic GDM (dmg)
prediction (prediksi)
SVM
undersampling
url https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/9009