The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets

Imbalanced class distribution in medical datasets is a challenging problem that hinders the correct classification of disease. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we propose undersampling the healthy class instance...

Bibliographic Details
Main Authors: Zina Z. R. Al-Shamaa, Sefer Kurnaz, Adil Deniz Duru, Nadia Peppa, Alex H. Mirnezami, Zaed Z. R. Hamady
Format: Article
Language: English
Published: Wiley 2020-01-01
Series: Applied Bionics and Biomechanics
Online Access: http://dx.doi.org/10.1155/2020/8824625
author Zina Z. R. Al-Shamaa
Sefer Kurnaz
Adil Deniz Duru
Nadia Peppa
Alex H. Mirnezami
Zaed Z. R. Hamady
collection DOAJ
description Imbalanced class distribution in medical datasets is a challenging problem that hinders the correct classification of disease. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we propose undersampling the healthy class instances to improve disease class classification. The model is named Hellinger Distance Undersampling (HDUS). It employs the Hellinger distance to measure the resemblance between a majority class instance and its neighbouring minority class instances, in order to separate the classes effectively and boost the discrimination power for each class. An extensive experiment was conducted on four imbalanced medical datasets using three classifiers to compare HDUS with a baseline model and three state-of-the-art undersampling models. The results show that HDUS can outperform the other models in terms of sensitivity, F1 measure, and balanced accuracy.
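As a rough illustration of the quantities named in the description, the sketch below computes the standard Hellinger distance between two discrete probability distributions, plus sensitivity, F1 measure, and balanced accuracy from confusion-matrix counts. It is not the HDUS procedure itself, which this record does not spell out; the function names `hellinger` and `imbalance_metrics` and the example counts are assumptions made for illustration only.

```python
import math


def hellinger(p, q):
    """Standard Hellinger distance between two discrete distributions.

    p and q are sequences of non-negative probabilities of equal length.
    The result lies in [0, 1]: 0 for identical distributions, 1 when the
    distributions have no overlapping support.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a denominator is zero.
    return numerator / denominator if denominator else 0.0


def imbalance_metrics(tp, fn, fp, tn):
    """Sensitivity, F1 measure, and balanced accuracy from confusion counts.

    The disease (minority) class is treated as the positive class.
    """
    sensitivity = _ratio(tp, tp + fn)   # recall on the disease class
    specificity = _ratio(tn, tn + fp)   # recall on the healthy class
    precision = _ratio(tp, tp + fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


if __name__ == "__main__":
    # Hypothetical distributions and counts, not taken from the article.
    print(hellinger([0.9, 0.1], [0.5, 0.5]))
    print(imbalance_metrics(tp=8, fn=2, fp=4, tn=86))
```

Balanced accuracy averages the recall of both classes, so a classifier that labels every instance as healthy scores 0.5 on it even when its plain accuracy is high, which is why it suits imbalanced data.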
format Article
id doaj-art-38c68a5cbfb74f58b29b369afac5bc96
institution Kabale University
issn 1176-2322
1754-2103
language English
publishDate 2020-01-01
publisher Wiley
record_format Article
series Applied Bionics and Biomechanics
spelling doaj-art-38c68a5cbfb74f58b29b369afac5bc96; 2025-08-20T03:55:11Z; eng; Wiley; Applied Bionics and Biomechanics; 1176-2322; 1754-2103; 2020-01-01; 2020; 10.1155/2020/8824625; 8824625
Zina Z. R. Al-Shamaa: Graduate School of Science and Engineering, Altınbaş University, Istanbul, Turkey
Sefer Kurnaz: Graduate School of Science and Engineering, Altınbaş University, Istanbul, Turkey
Adil Deniz Duru: Sports and Health Sciences Department, Marmara University, Istanbul, Turkey
Nadia Peppa: Southampton University Hospital NHSFT, Southampton, UK
Alex H. Mirnezami: Southampton University Hospital NHSFT, Southampton, UK
Zaed Z. R. Hamady: Southampton University Hospital NHSFT, Southampton, UK
title The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets
url http://dx.doi.org/10.1155/2020/8824625