The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets
Imbalanced class distribution in medical datasets is a challenging problem that hinders correct disease classification. It arises when healthy-class instances far outnumber disease-class instances. To address this problem, we propose undersampling the healthy class instance...
| Main Authors: | Zina Z. R. Al-Shamaa, Sefer Kurnaz, Adil Deniz Duru, Nadia Peppa, Alex H. Mirnezami, Zaed Z. R. Hamady |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2020-01-01 |
| Series: | Applied Bionics and Biomechanics |
| Online Access: | http://dx.doi.org/10.1155/2020/8824625 |
| _version_ | 1849306182186958848 |
|---|---|
| author | Zina Z. R. Al-Shamaa; Sefer Kurnaz; Adil Deniz Duru; Nadia Peppa; Alex H. Mirnezami; Zaed Z. R. Hamady |
| author_sort | Zina Z. R. Al-Shamaa |
| collection | DOAJ |
| description | Imbalanced class distribution in medical datasets is a challenging problem that hinders correct disease classification. It arises when healthy-class instances far outnumber disease-class instances. To address this problem, we propose undersampling the healthy-class instances to improve disease-class classification. The model, named Hellinger Distance Undersampling (HDUS), uses the Hellinger distance to measure the resemblance between each majority-class instance and its neighbouring minority-class instances, in order to separate the classes effectively and boost the discrimination power for each class. Extensive experiments were conducted on four imbalanced medical datasets using three classifiers to compare HDUS with a baseline model and three state-of-the-art undersampling models. The results show that HDUS can outperform the other models in terms of sensitivity, F1 measure, and balanced accuracy. |
| format | Article |
| id | doaj-art-38c68a5cbfb74f58b29b369afac5bc96 |
| institution | Kabale University |
| issn | 1176-2322 1754-2103 |
| language | English |
| publishDate | 2020-01-01 |
| publisher | Wiley |
| record_format | Article |
| series | Applied Bionics and Biomechanics |
| spelling | doaj-art-38c68a5cbfb74f58b29b369afac5bc96; 2025-08-20T03:55:11Z; eng; Wiley; Applied Bionics and Biomechanics; 1176-2322; 1754-2103; 2020-01-01; 2020; 10.1155/2020/8824625; 8824625. Authors and affiliations: Zina Z. R. Al-Shamaa (Graduate School of Science and Engineering, Altınbaş University, Istanbul, Turkey); Sefer Kurnaz (Graduate School of Science and Engineering, Altınbaş University, Istanbul, Turkey); Adil Deniz Duru (Sports and Health Sciences Department, Marmara University, Istanbul, Turkey); Nadia Peppa (Southampton University Hospital NHSFT, Southampton, UK); Alex H. Mirnezami (Southampton University Hospital NHSFT, Southampton, UK); Zaed Z. R. Hamady (Southampton University Hospital NHSFT, Southampton, UK) |
| title | The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets |
| url | http://dx.doi.org/10.1155/2020/8824625 |
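
The description names the Hellinger distance and three evaluation measures without defining them. The following is a minimal sketch of the standard textbook definitions, not the HDUS procedure itself: this record does not say how instances are turned into distributions or how neighbours are selected, so those steps are left out. The function names `hellinger` and `imbalance_metrics` are hypothetical, and the inputs are assumed to be discrete probability distributions and confusion-matrix counts with at least one instance per class.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Assumes p and q are equal-length sequences of non-negative numbers
    that each sum to 1. The result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared_diff = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_diff) / math.sqrt(2)


def imbalance_metrics(tp, fn, fp, tn):
    """Sensitivity, F1 measure, and balanced accuracy from confusion counts.

    The disease class is treated as the positive class. Assumes both
    classes are non-empty, so no denominator is zero.
    """
    sensitivity = tp / (tp + fn)  # recall on the disease class
    specificity = tn / (tn + fp)  # recall on the healthy class
    f1 = 2 * tp / (2 * tp + fp + fn)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


if __name__ == "__main__":
    # Toy numbers chosen for illustration only; they are not from the article.
    print(hellinger([0.7, 0.3], [0.4, 0.6]))
    print(imbalance_metrics(tp=8, fn=2, fp=10, tn=80))
```

Balanced accuracy averages the recall of both classes, which is why it is preferred over plain accuracy when the healthy class dominates.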