Predicting classification errors using NLP-based machine learning algorithms and expert opinions

Bibliographic Details
Main Authors: Peiheng Gao, Chen Yang, Ning Sun, Ričardas Zitikis
Format: Article
Language: English
Published: Elsevier 2025-03-01
Series: Machine Learning with Applications
Online Access: http://www.sciencedirect.com/science/article/pii/S2666827025000131
Description
Summary: Various intentional and unintentional biases of humans manifest in classification tasks, such as those related to risk management. In this paper we demonstrate the role of ML algorithms when accomplishing these tasks and highlight the role of expert know-how when training the staff as well as, and very importantly, when training and fine-tuning ML algorithms. In the process of doing so and when facing well-known inefficiencies of the traditional F1 score, especially when working with unbalanced datasets, we suggest a modification of the score by incorporating human-experience-trained algorithms, which include both expert-trained algorithms (i.e., with the involvement of expert experiences in classification tasks) and staff-trained algorithms (i.e., with the involvement of experiences of those staff who have been trained by experts). Our findings reveal that the modified F1 score diverges from the traditional staff F1 score when the staff labels exhibit weak correlation with expert labels, which indicates insufficient staff training. Furthermore, the Long Short-Term Memory (LSTM) model outperforms other classifiers in terms of the modified F1 score when applied to the classification of textual narratives in consumer complaints.
ISSN: 2666-8270
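The summary refers to well-known inefficiencies of the traditional F1 score on unbalanced datasets. The minimal sketch below illustrates that point only; it does not implement the paper's modified score, whose expert-label weighting is not detailed in this record, and the toy class counts are illustrative assumptions.

```python
# Sketch: how the *traditional* F1 score behaves on an unbalanced
# binary dataset. Not the paper's modified score.

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical unbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# Predicting the majority class everywhere scores 95% accuracy
# yet F1 = 0 on the rare (positive) class.
print(f1_score(y_true, [0] * 100))  # 0.0

# Recovering 3 of the 5 positives with 2 false alarms:
# precision = recall = 3/5, so F1 = 0.6.
y_pred = [0] * 93 + [1] * 5 + [0] * 2
print(f1_score(y_true, y_pred))
```

The gap between the 95% accuracy of the trivial classifier and its zero F1 is the classic motivation for F1-style metrics on rare-class problems, and the starting point for the modification the abstract describes.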