Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients

The increasing adoption of deep neural networks (DNNs) in critical domains such as healthcare, finance, and autonomous systems underscores the growing importance of explainable artificial intelligence (XAI). In these high-stakes applications, understanding the decision-making processes of models is essential for ensuring trust and safety.


Bibliographic Details
Main Authors: Su-Ying Guo, Xiu-Jun Gong
Format: Article
Language: English
Published: MDPI AG, 2025-06-01
Series: Applied Sciences
Subjects: explainable AI; expected gradients; prior baseline; feature attributions
Online Access: https://www.mdpi.com/2076-3417/15/13/7090
_version_ 1849429009294688256
author Su-Ying Guo
Xiu-Jun Gong
author_facet Su-Ying Guo
Xiu-Jun Gong
author_sort Su-Ying Guo
collection DOAJ
description The increasing adoption of deep neural networks (DNNs) in critical domains such as healthcare, finance, and autonomous systems underscores the growing importance of explainable artificial intelligence (XAI). In these high-stakes applications, understanding the decision-making processes of models is essential for ensuring trust and safety. However, traditional DNNs often function as “black boxes,” delivering accurate predictions without providing insight into the factors driving their outputs. Expected gradients (EG) is a prominent method for generating such explanations by calculating the contribution of each input feature to the final decision. Despite its effectiveness, the conventional baselines used in state-of-the-art implementations of EG often lack a clear definition of what constitutes “missing” information. This study proposes DeepPrior-EG, a deep prior-guided EG framework that leverages prior knowledge to align baselines more closely with the concept of missingness and thereby enhance interpretive fidelity. It resolves the baseline misalignment by initiating gradient path integration from learned prior baselines derived from the deep features of CNN layers. This approach not only mitigates feature-absence artifacts but also amplifies critical feature contributions through adaptive gradient aggregation. The study further introduces two probabilistic prior modeling strategies: a multivariate Gaussian model (MGM) that captures high-dimensional feature interdependencies, and a Bayesian nonparametric Gaussian mixture model (BGMM) that autonomously infers mixture complexity for heterogeneous feature distributions. An explanation-driven model retraining paradigm is also implemented to validate the robustness of the proposed framework. Comprehensive evaluations across qualitative and quantitative metrics demonstrate superior interpretability, and the BGMM variant achieves competitive attribution quality and faithfulness against existing methods. DeepPrior-EG advances the interpretability of complex models within the XAI landscape and unlocks their potential in safety-critical applications.
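As a rough illustration of the expected-gradients idea summarized in the description above, the sketch below estimates attributions as a Monte-Carlo average of (x - x') * df/dx evaluated along straight-line paths from baselines x' to the input x, with the baselines drawn from a multivariate Gaussian fitted to reference inputs rather than taken directly from background data. This is a minimal sketch under assumptions, not the authors' DeepPrior-EG implementation: it uses PyTorch, the names fit_gaussian_prior and expected_gradients are hypothetical, it places the prior over raw inputs rather than over deep CNN features, and it omits the paper's Bayesian nonparametric GMM and adaptive gradient aggregation.

    # Illustrative sketch only (not the authors' DeepPrior-EG code): expected gradients
    # with baselines x' drawn from a multivariate Gaussian prior fitted to reference
    # inputs. Function names here are hypothetical.
    import torch

    def fit_gaussian_prior(reference_x: torch.Tensor) -> torch.distributions.MultivariateNormal:
        """Fit a multivariate Gaussian (mean and covariance) to flattened reference inputs."""
        flat = reference_x.flatten(start_dim=1)                    # (N, D)
        mean = flat.mean(dim=0)
        cov = torch.cov(flat.T) + 1e-4 * torch.eye(flat.shape[1])  # regularized for stability
        return torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)

    def expected_gradients(model, x, prior, target_class, n_samples=64):
        """Monte-Carlo estimate of E_{x'~prior, a~U(0,1)}[(x - x') * df(x' + a(x - x'))/dx]."""
        attributions = torch.zeros_like(x)
        for _ in range(n_samples):
            baseline = prior.sample().reshape(x.shape)          # one prior baseline x'
            alpha = torch.rand(())                              # interpolation coefficient a
            point = (baseline + alpha * (x - baseline)).requires_grad_(True)
            score = model(point.unsqueeze(0))[0, target_class]  # scalar class score f(.)
            grad, = torch.autograd.grad(score, point)
            attributions += (x - baseline) * grad               # one path-integrand sample
        return attributions / n_samples

    if __name__ == "__main__":
        torch.manual_seed(0)
        model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 3))
        reference = torch.randn(200, 8)      # stand-in background data defining "missingness"
        prior = fit_gaussian_prior(reference)
        x = torch.randn(8)
        print(expected_gradients(model, x, prior, target_class=1))

For a mixture prior in the spirit of the paper's BGMM variant, one could instead fit sklearn.mixture.BayesianGaussianMixture to the reference features and sample baselines from it; the rest of the sketch would be unchanged.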
format Article
id doaj-art-9d7e3cc414a5406f92a12d020db048db
institution Kabale University
issn 2076-3417
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-9d7e3cc414a5406f92a12d020db048db 2025-08-20T03:28:29Z eng MDPI AG
  Applied Sciences, ISSN 2076-3417, 2025-06-01, Vol. 15, Iss. 13, Art. 7090, DOI 10.3390/app15137090
  Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
  Su-Ying Guo; Xiu-Jun Gong (both: College of Intelligence and Computing, Tianjin University, No. 135 Yaguan Road, Haihe Education Park, Tianjin 300354, China)
  [abstract identical to the description field above]
  https://www.mdpi.com/2076-3417/15/13/7090
  explainable AI; expected gradients; prior baseline; feature attributions
spellingShingle Su-Ying Guo
Xiu-Jun Gong
Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
Applied Sciences
explainable AI
expected gradients
prior baseline
feature attributions
title Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
title_full Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
title_fullStr Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
title_full_unstemmed Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
title_short Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
title_sort enhancing neural network interpretability through deep prior guided expected gradients
topic explainable AI
expected gradients
prior baseline
feature attributions
url https://www.mdpi.com/2076-3417/15/13/7090
work_keys_str_mv AT suyingguo enhancingneuralnetworkinterpretabilitythroughdeeppriorguidedexpectedgradients
AT xiujungong enhancingneuralnetworkinterpretabilitythroughdeeppriorguidedexpectedgradients