Data Mining for the Adjustment of Credit Scoring Models in Solidarity Economy Entities: A Methodology for Addressing Class Imbalances

Bibliographic Details
Main Authors: Ivan Mauricio Bermudez Vera, Jaime Mosquera Restrepo, Diego Fernando Manotas-Duque
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Risks
Subjects: credit risk; solidarity economy; data mining; class balancing; logistic regression
Online Access: https://www.mdpi.com/2227-9091/13/2/20
collection DOAJ
description This study addresses the quantification of credit risk in solidarity economy entities, proposing a new methodology that redefines the concept of “default” for the frequent situations of extreme class imbalance. The objective is to develop and evaluate credit scoring models that enhance risk management by incorporating internal and external data to assess default risk. Data mining techniques are applied to address class imbalance: the term “default” is redefined to include external credit information, which increases the representation of the minority class. The effectiveness of machine learning and statistical models is evaluated using class-balancing methods such as under-sampling, over-sampling, and the Synthetic Minority Over-sampling Technique (SMOTE). The evaluation is based on the Balanced Accuracy metric and on the holding power of performance, so that the model's predictive power stays consistent and overfitting is avoided. While machine learning methods can improve credit scoring, logistic regression-based models remain effective, especially when combined with class-balancing techniques. The study concludes that a sample balanced in class size is essential to improve predictive performance.
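The two ideas named in the description, Balanced Accuracy and class balancing by over-sampling, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy 95:5 label split, and the seed are all hypothetical, and plain random over-sampling is shown in place of SMOTE (which synthesizes new minority points by interpolating between neighbours rather than duplicating rows).

```python
import random
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes, the average of sensitivity and specificity."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == c]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(c)
    return out_rows, out_labels


# Hypothetical toy labels: 1 = default, 0 = non-default (95:5 imbalance).
y_true = [0] * 95 + [1] * 5
y_all_zero = [0] * 100  # a "model" that never predicts default

accuracy = sum(t == p for t, p in zip(y_true, y_all_zero)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_all_zero)

# Balance the classes; the row values here are just placeholder indices.
rows_bal, y_bal = random_oversample(list(range(100)), y_true)
balanced_counts = Counter(y_bal)
```

On this toy split the never-default model scores high on plain accuracy but only 0.5 on balanced accuracy, which is why a metric that weights both classes equally suits heavily imbalanced default data. In practice any resampling would be applied to the training split only, so the evaluation data keeps its original class proportions.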
id doaj-art-e28cf82803d44334bc2155b83532b894
institution DOAJ
issn 2227-9091
spelling doaj-art-e28cf82803d44334bc2155b83532b894
Record timestamp: 2025-08-20T02:44:56Z
Citation: Risks, vol. 13, no. 2, article 20 (2025-01-01)
DOI: 10.3390/risks13020020
Affiliations:
Ivan Mauricio Bermudez Vera: School of Industrial Engineering, Universidad del Valle, Cali 760042, Colombia
Jaime Mosquera Restrepo: School of Statistics, Universidad del Valle, Cali 760042, Colombia
Diego Fernando Manotas-Duque: School of Industrial Engineering, Universidad del Valle, Cali 760042, Colombia
topic credit risk
solidarity economy
data mining
class balancing
logistic regression