A robust and interpretable ensemble machine learning model for predicting healthcare insurance fraud

Bibliographic Details
Main Authors:	Zeyu Wang, Xiaofang Chen, Yiwei Wu, Linke Jiang, Shiming Lin, Gang Qiu
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Scientific Reports
Subjects:	Healthcare insurance fraud Machine learning Model ensemble Model interpretability
Online Access:	https://doi.org/10.1038/s41598-024-82062-x

collection	DOAJ
description	Abstract Healthcare insurance fraud imposes a significant financial burden on healthcare systems worldwide, with annual losses reaching billions of dollars. This study aims to improve fraud detection accuracy using machine learning techniques. Our approach consists of three key stages: data preprocessing, model training and integration, and result analysis with feature interpretation. Initially, we examined the dataset’s characteristics and employed embedded and permutation methods to test the performance and runtime of single models under different feature sets, selecting the minimal number of features that could still achieve high performance. We then applied ensemble techniques, including Voting, Weighted, and Stacking methods, to combine different models and compare their performances. Feature interpretation was achieved through partial dependence plots (PDP), SHAP, and LIME, allowing us to understand each feature’s impact on the predictions. Finally, we benchmarked our approach against existing studies to evaluate its advantages and limitations. The findings demonstrate improved fraud detection accuracy and offer insights into the interpretability of machine learning models in this context.
id	doaj-art-e4de126d13d440208255d98e9792c29d
institution	OA Journals
issn	2045-2322
spelling	doaj-art-e4de126d13d440208255d98e9792c29d; 2025-08-20T01:51:31Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; vol. 15, no. 1, pp. 1-22; 10.1038/s41598-024-82062-x; A robust and interpretable ensemble machine learning model for predicting healthcare insurance fraud; Zeyu Wang (School of Informatics, Xiamen University), Xiaofang Chen (Xiang’an Hospital, Xiamen University), Yiwei Wu (School of Informatics, Xiamen University), Linke Jiang (School of Informatics, Xiamen University), Shiming Lin (School of Informatics, Xiamen University), Gang Qiu (School of Information Engineering, Changji University); https://doi.org/10.1038/s41598-024-82062-x; Healthcare insurance fraud, Machine learning, Model ensemble, Model interpretability

Similar Items