Combination of machine learning and data envelopment analysis to measure the efficiency of the Tax Service Office

The Tax Service Office, a division of the Directorate General of Taxes, is responsible for providing taxation services to the public and collecting taxes. Achieving tax targets efficiently while utilizing available resources is crucial. To assess the performance efficiency of decision-making units (...

Bibliographic Details
Main Authors: Shofinurdin Soffan, Arif Bramantoro, Ahmad A. Alzahrani
Format: Article
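Language: English
Published: PeerJ Inc. 2025-02-01
Series: PeerJ Computer Science
Subjects: Machine learning; Data envelopment analysis; Efficiency; Tax service office; Genetic algorithm; Multilayer perceptron
Online Access: https://peerj.com/articles/cs-2672.pdf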
collection DOAJ
description The Tax Service Office, a division of the Directorate General of Taxes, is responsible for providing taxation services to the public and collecting taxes. Achieving tax targets efficiently while utilizing available resources is crucial. To assess the performance efficiency of decision-making units (DMUs), data envelopment analysis (DEA) is commonly employed. However, DEA often requires the DMUs to be homogeneous, which can be ensured by applying machine learning clustering techniques. In this study, we propose a three-stage approach (clustering, DEA, and regression) to measure the efficiency of all tax service office units. Real datasets from Indonesian tax service offices were used while maintaining strict confidentiality. Unlike previous studies that clustered on both input and output variables, we cluster on input variables only, as this leads to more objective efficiency values when combining the results from each cluster. The results revealed three clusters with a silhouette score of 0.304 and a Davies-Bouldin index of 1.119, demonstrating the effectiveness of fuzzy c-means clustering. Out of 352 DMUs, 225, or approximately 64%, were identified as efficient using DEA calculations. We propose a regression algorithm to measure the efficiency of DMUs in new office planning by determining the values of input and output variables. The optimization of multilayer perceptrons using genetic algorithms reduced the mean squared error by about 75.75%, from 0.0144 to 0.0035. Based on our findings, the overall performance of tax service offices in Indonesia has reached an efficiency level of 64%. These results show a significant improvement over the previous study, in which only about 18% of offices were considered efficient. The main contribution of this research is the development of a comprehensive framework for evaluating and predicting tax office efficiency, providing valuable insights for improving performance.
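The DEA stage named in the description can be illustrated with a minimal sketch. The record does not say which DEA model the authors used, so the input-oriented CCR (constant returns to scale) envelopment form below is an assumption, and the function name `ccr_input_efficiency` and the toy data are hypothetical. For each DMU the linear program finds the smallest factor theta by which its inputs could be scaled while a non-negative combination of peer DMUs still produces at least its outputs; theta equal to 1 marks an efficient unit.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_input_efficiency(X, Y):
    """Input-oriented CCR efficiency score for every DMU.

    X has shape (n_dmus, n_inputs), Y has shape (n_dmus, n_outputs).
    This is an illustrative model choice, not necessarily the paper's.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        # Decision variables: [theta, lambda_1, ..., lambda_n]
        c = np.r_[1.0, np.zeros(n)]
        # Inputs: sum_j lambda_j * x_ij - theta * x_io <= 0
        A_in = np.hstack([-X[o][:, None], X.T])
        # Outputs: -sum_j lambda_j * y_rj <= -y_ro
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])
        res = linprog(
            c,
            A_ub=np.vstack([A_in, A_out]),
            b_ub=np.r_[np.zeros(m), -Y[o]],
            bounds=[(0, None)] * (n + 1),
            method="highs",
        )
        scores[o] = res.fun
    return scores


# Hypothetical toy data: three DMUs, one input and one output each.
X = [[2.0], [4.0], [8.0]]
Y = [[2.0], [2.0], [8.0]]
scores = ccr_input_efficiency(X, Y)
n_efficient = int(np.sum(scores >= 1.0 - 1e-6))
print(scores, n_efficient)
```

In the three-stage setup the description outlines, a routine like this would be run separately on the DMUs of each cluster, and the share of efficient units would be counted the same way `n_efficient` is counted here.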
format Article
id doaj-art-a3ecb1ad871a49b3a77680594bad9da2
institution DOAJ
issn 2376-5992
language English
publishDate 2025-02-01
publisher PeerJ Inc.
record_format Article
series PeerJ Computer Science
doi 10.7717/peerj-cs.2672
volume 11
article_number e2672
affiliation Shofinurdin Soffan: Faculty of Information Technology, Universitas Budi Luhur, Jakarta, Indonesia
Arif Bramantoro: School of Computing and Informatics, Universiti Teknologi Brunei, Bandar Seri Begawan, Brunei
Ahmad A. Alzahrani: Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
topic Machine learning
Data envelopment analysis
Efficiency
Tax service office
Genetic algorithm
Multilayer perceptron