Imbalanced class distribution and performance evaluation metrics: A systematic review of prediction accuracy for determining model performance in healthcare systems.

Most research studies focus extensively on predictive algorithms and their performance evaluation in order to identify the best or most appropriate predictive model, with the optimum prediction solution indicated by metrics such as prediction accuracy score, precision, recall, and F1 score. Prediction accuracy score from perfo...

Bibliographic Details
Main Authors: Michael Owusu-Adjei, James Ben Hayfron-Acquah, Twum Frimpong, Gaddafi Abdul-Salaam
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-11-01
Series: PLOS Digital Health
Online Access: https://doi.org/10.1371/journal.pdig.0000290
collection DOAJ
description Most research studies focus extensively on predictive algorithms and their performance evaluation in order to identify the best or most appropriate predictive model, with the optimum prediction solution indicated by metrics such as prediction accuracy score, precision, recall, and F1 score. The prediction accuracy score has been used extensively as the main determining metric for performance recommendation. It is one of the most widely used metrics for identifying the optimal prediction solution, irrespective of the dataset's class distribution or the balance of the output class between minority and majority classes. The key research question, however, is how class inequality affects the prediction accuracy score, as compared with the balanced accuracy score, when determining model performance on datasets with imbalanced output classes in healthcare and other real-world application systems. Answering this question requires an appraisal of the current state of knowledge on the use of both the prediction accuracy score and the balanced accuracy score in real-world applications where class distribution is unequal. A review of related works that highlight the use of imbalanced class distribution datasets with evaluation metrics will assist in contextualizing this systematic review.
id doaj-art-b84eaf9881a24df3af5bee151a80c91b
institution Kabale University
issn 2767-3170
spelling doaj-art-b84eaf9881a24df3af5bee151a80c91b; 2025-08-20T03:37:57Z; eng; Public Library of Science (PLoS); PLOS Digital Health; 2767-3170; 2023-11-01; vol. 2, no. 11, e0000290; 10.1371/journal.pdig.0000290
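
The contrast the description draws between prediction accuracy score and balanced accuracy score can be illustrated with a small worked example. The sketch below is not taken from the article: the 95/5 class split, the labels, and the always-predict-majority "model" are all assumptions chosen only to show how the two metrics can diverge on an imbalanced dataset. Balanced accuracy is computed here as the unweighted mean of per-class recall.

```python
def accuracy(y_true, y_pred):
    """Fraction of all samples that were predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall, so each class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        # Indices of the samples that truly belong to this class.
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(y_pred[i] == label for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced dataset: 95 negative cases (0), 5 positive cases (1).
y_true = [0] * 95 + [1] * 5

# Hypothetical trivial model that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)

print(f"accuracy          = {acc:.2f}")
print(f"balanced accuracy = {bal:.2f}")
```

Under these assumptions the trivial model scores 0.95 on accuracy while detecting none of the minority cases, and balanced accuracy drops to 0.50 because recall on the minority class is zero.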