Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients
The increasing adoption of DNNs in critical domains such as healthcare, finance, and autonomous systems underscores the growing importance of explainable artificial intelligence (XAI). In these high-stakes applications, understanding the decision-making processes of models is essential for ensuring...
Saved in:
| Main Authors: | Su-Ying Guo, Xiu-Jun Gong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Applied Sciences |
| Subjects: | explainable AI; expected gradients; prior baseline; feature attributions |
| Online Access: | https://www.mdpi.com/2076-3417/15/13/7090 |
| _version_ | 1849429009294688256 |
|---|---|
| author | Su-Ying Guo; Xiu-Jun Gong |
| author_facet | Su-Ying Guo; Xiu-Jun Gong |
| author_sort | Su-Ying Guo |
| collection | DOAJ |
| description | The increasing adoption of DNNs in critical domains such as healthcare, finance, and autonomous systems underscores the growing importance of explainable artificial intelligence (XAI). In these high-stakes applications, understanding the decision-making processes of models is essential for ensuring trust and safety. However, traditional DNNs often function as “black boxes,” delivering accurate predictions without providing insight into the factors driving their outputs. Expected gradients (EG) is a prominent method for producing such explanations by calculating the contribution of each input feature to the final decision. Despite its effectiveness, the conventional baselines used in state-of-the-art implementations of EG often lack a clear definition of what constitutes “missing” information. This study proposes DeepPrior-EG, a deep prior-guided EG framework that leverages prior knowledge to align the baseline more closely with the concept of missingness and to enhance interpretive fidelity. It resolves the baseline misalignment by initiating gradient-path integration from learned prior baselines, which are derived from the deep features of CNN layers. This approach not only mitigates feature-absence artifacts but also amplifies critical feature contributions through adaptive gradient aggregation. The study further introduces two probabilistic prior modeling strategies: a multivariate Gaussian model (MGM) that captures high-dimensional feature interdependencies and a Bayesian nonparametric Gaussian mixture model (BGMM) that autonomously infers mixture complexity for heterogeneous feature distributions. An explanation-driven model retraining paradigm is also implemented to validate the robustness of the proposed framework. Comprehensive evaluations across various qualitative and quantitative metrics demonstrate its superior interpretability, and the BGMM variant achieves competitive attribution quality and faithfulness against existing methods. DeepPrior-EG advances the interpretability of complex models within the XAI landscape and unlocks their potential in safety-critical applications. |
| format | Article |
| id | doaj-art-9d7e3cc414a5406f92a12d020db048db |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-9d7e3cc414a5406f92a12d020db048db; 2025-08-20T03:28:29Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-06-01; vol. 15, iss. 13, art. 7090; 10.3390/app15137090; Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients; Su-Ying Guo, Xiu-Jun Gong (College of Intelligence and Computing, Tianjin University, No. 135 Yaguan Road, Haihe Education Park, Tianjin 300354, China); abstract as in the description field above; https://www.mdpi.com/2076-3417/15/13/7090; explainable AI; expected gradients; prior baseline; feature attributions |
| spellingShingle | Su-Ying Guo Xiu-Jun Gong Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients Applied Sciences explainable AI expected gradients prior baseline feature attributions |
| title | Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients |
| title_full | Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients |
| title_fullStr | Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients |
| title_full_unstemmed | Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients |
| title_short | Enhancing Neural Network Interpretability Through Deep Prior-Guided Expected Gradients |
| title_sort | enhancing neural network interpretability through deep prior guided expected gradients |
| topic | explainable AI; expected gradients; prior baseline; feature attributions |
| url | https://www.mdpi.com/2076-3417/15/13/7090 |
| work_keys_str_mv | AT suyingguo enhancingneuralnetworkinterpretabilitythroughdeeppriorguidedexpectedgradients AT xiujungong enhancingneuralnetworkinterpretabilitythroughdeeppriorguidedexpectedgradients |
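
The abstract in the description field describes expected gradients with baselines drawn from a learned prior. For reference, standard expected gradients attribute feature $i$ of input $x$ as $\phi_i(x) = \mathbb{E}_{x' \sim D,\ \alpha \sim U(0,1)}\big[(x_i - x'_i)\,\partial f(x' + \alpha(x - x'))/\partial x_i\big]$, where $D$ is the baseline distribution; DeepPrior-EG, as described, replaces draws from the raw data distribution with learned prior baselines derived from deep CNN features. The sketch below is a minimal illustration of that idea under stated assumptions, not the paper's implementation: it fits the prior with scikit-learn's `BayesianGaussianMixture` on flattened reference inputs in input space, and the helper names `fit_prior` and `expected_gradients` are hypothetical. The adaptive gradient aggregation and explanation-driven retraining mentioned in the abstract are not reproduced.

```python
# Hedged sketch: expected gradients with baselines sampled from a learned
# Gaussian-mixture prior. Assumption (not from the paper): the prior is fit on
# flattened reference inputs; DeepPrior-EG instead derives its prior baselines
# from deep CNN-layer features.
import numpy as np
import torch
from sklearn.mixture import BayesianGaussianMixture


def fit_prior(reference_x: np.ndarray, max_components: int = 10) -> BayesianGaussianMixture:
    """Fit a Bayesian nonparametric GMM; reference_x has shape (n_samples, n_features)."""
    prior = BayesianGaussianMixture(
        n_components=max_components,  # upper bound; the Dirichlet process prunes unused components
        weight_concentration_prior_type="dirichlet_process",
        covariance_type="diag",       # diagonal covariances keep high-dimensional fitting tractable
        max_iter=200,
    )
    prior.fit(reference_x)
    return prior


def expected_gradients(model, x, prior, n_samples: int = 50):
    """Monte Carlo expected-gradients attributions for one input x of shape (1, C, H, W)."""
    model.eval()
    with torch.no_grad():
        target_class = model(x).argmax(dim=1)  # explain the predicted class
    attributions = torch.zeros_like(x)
    for _ in range(n_samples):
        # Draw a baseline from the learned prior (feature dim must equal C*H*W)
        # and a random interpolation coefficient alpha.
        baseline_np, _ = prior.sample(1)
        baseline = torch.as_tensor(baseline_np, dtype=x.dtype).reshape_as(x)
        alpha = torch.rand(1).item()
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        # Gradient of the target-class score at the interpolation point,
        # weighted by the input-minus-baseline difference.
        score = model(point).gather(1, target_class.unsqueeze(1)).sum()
        grad = torch.autograd.grad(score, point)[0]
        attributions += (x - baseline) * grad
    return attributions / n_samples
```

In this sketch the number of mixture components is only an upper bound: the Dirichlet-process weight prior lets the fitted BGMM suppress unused components, which mirrors the abstract's claim that the BGMM autonomously infers mixture complexity from the reference data.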