Attribution-based interpretable classification neural network with global and local perspectives

Abstract: Neural networks are challenging to apply in domains requiring high reliability due to their black-box nature, and researchers are increasingly focusing on interpreting neural networks. While pursuing neural network performance, most methods often sacrifice interpretability by interpreting t...

Bibliographic Details
Main Authors: Zihao Shi, Zuqiang Meng, Haiming Tuo, Chaohong Tan
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-06218-z
collection DOAJ
description Abstract: Neural networks are difficult to apply in domains that require high reliability because of their black-box nature, and researchers are increasingly focusing on interpreting them. In pursuit of performance, most methods sacrifice interpretability by interpreting the model only after training; such post-hoc interpretation is often local and offers little detailed information. To obtain both strong interpretability and classification performance, we propose an attribution-based interpretable classification model for tabular data that maps the intermediate output to the interpretable data representation space and automatically selects the corresponding feature values for classification and interpretation. It assigns an importance value to each input feature of an instance to achieve local interpretability while also reflecting the global importance of input features. Furthermore, we propose different training methods. In searching for the best way to train the model, we find a trade-off between classification performance and interpretability. Experimental results on eight open-source datasets show that our method is comparable in classification accuracy to competitive black-box neural networks. On two metrics for attribution methods, Reverse Precision and Generality, our model outperforms two popular post-hoc interpretation methods.
format Article
id doaj-art-cb77fab8f438473ea5019e9d117128a7
institution DOAJ
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-cb77fab8f438473ea5019e9d117128a7 2025-08-20T03:04:38Z eng Nature Portfolio Scientific Reports 2045-2322 2025-07-01 15 1 1 18 10.1038/s41598-025-06218-z
Zihao Shi (Guangxi University, College of Computer, Electronics and Information)
Zuqiang Meng (Guangxi University, College of Computer, Electronics and Information)
Haiming Tuo (Guangxi University, College of Computer, Electronics and Information)
Chaohong Tan (Guangxi Zhuang Autonomous Region Information Center, Guangxi Key Laboratory of Digital Infrastructure)
title Attribution-based interpretable classification neural network with global and local perspectives
url https://doi.org/10.1038/s41598-025-06218-z