Enhancing the Classification of Imbalanced Arabic Medical Questions Using DeepSMOTE

Bibliographic Details
Main Authors:	Bushra Al-Smadi, Bassam Hammo, Hossam Faris, Pedro A. Castillo
Format:	Article
Language:	English
Published:	MDPI AG 2025-04-01
Series:	AI
Subjects:	DeepSMOTE; multi-class classification; oversampling techniques; medical questions; Arabic language
Online Access:	https://www.mdpi.com/2673-2688/6/4/77

collection	DOAJ
description	The growing demand for telemedicine has highlighted the need for automated healthcare services, particularly in medical question classification. This study presents a deep learning model designed to address key challenges in telemedicine, including class imbalance and accurate routing of Arabic medical questions to the correct specialties. The model combines AraBERTv0.2-Twitter, fine-tuned for informal Arabic, with Bidirectional Long Short-Term Memory (BiLSTM) networks to capture deep semantic relationships in medical text. We used a labeled dataset of 5000 Arabic consultation records from Altibbi, covering five key medical specialties selected for their clinical relevance and frequency. The data underwent preprocessing to remove noise and normalize text. We employed stratified sampling to ensure representative distribution across the selected medical specialties. We evaluated multiple models using macro precision, macro recall, macro F1-score, weighted F1-score, and G-Mean. Our results demonstrate that DeepSMOTE combined with cross-entropy loss achieves the best performance. The improvements are statistically significant and have practical implications for screening and patient routing in telemedicine platforms.
id	doaj-art-60c73cc044a14b359ff94de776a42d08
institution	DOAJ
issn	2673-2688
doi	10.3390/ai6040077
citation	AI, volume 6, issue 4, article 77
affiliation	Bushra Al-Smadi, Bassam Hammo, Hossam Faris: King Abdullah II School of Information Technology, The University of Jordan, Amman 11942, Jordan
affiliation	Pedro A. Castillo: Department of Computer Engineering, Automatics and Robotics, Higher Technical School of Computer Sciences and Telecommunications Engineering (ETSIIT)-Communication and Information Technologies Researching Centre (CITIC), University of Granada, 18071 Granada, Spain
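The description lists macro precision, macro recall, macro F1-score, weighted F1-score, and G-Mean as the evaluation metrics. The record does not give their formulas, so the following is only a minimal sketch under common definitions: macro scores average the per-class values equally, weighted F1 averages per-class F1 by class frequency, and G-Mean is taken here as the geometric mean of per-class recall. The function names and the two specialty labels are hypothetical, and this is not the authors' evaluation code.

```python
from collections import Counter
from math import prod


def per_class_stats(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for each label present in y_true."""
    stats = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Assumption: undefined ratios (zero denominator) are scored as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        stats[label] = (precision, recall, f1)
    return stats


def summarize(y_true, y_pred):
    """Aggregate per-class scores into the five metrics named in the abstract."""
    stats = per_class_stats(y_true, y_pred)
    support = Counter(y_true)  # number of true examples per class
    n_classes = len(stats)
    return {
        # Macro averages treat every class equally, regardless of its size.
        "macro_precision": sum(s[0] for s in stats.values()) / n_classes,
        "macro_recall": sum(s[1] for s in stats.values()) / n_classes,
        "macro_f1": sum(s[2] for s in stats.values()) / n_classes,
        # Weighted F1 weights each class by its share of the true labels.
        "weighted_f1": sum(stats[c][2] * support[c] for c in stats) / len(y_true),
        # G-Mean (assumed definition): geometric mean of per-class recall.
        "g_mean": prod(s[1] for s in stats.values()) ** (1 / n_classes),
    }


if __name__ == "__main__":
    # Hypothetical toy labels: a majority class and a minority class.
    y_true = ["derm", "derm", "derm", "derm", "cardio", "cardio"]
    y_pred = ["derm", "derm", "derm", "cardio", "cardio", "derm"]
    for name, value in summarize(y_true, y_pred).items():
        print(f"{name}: {value:.4f}")
```

On imbalanced data the macro scores and G-Mean penalize poor minority-class performance more than the weighted F1 does, which is presumably why a study on class imbalance reports them side by side.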
