Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance

Post hoc explanations for black-box machine learning models have been criticized for potentially inaccurate surrogate models and computational burden at prediction time. We propose pre hoc and co hoc explainability frameworks that integrate interpretability directly into the training process through...

Full description

Saved in:
Bibliographic Details
Main Authors: Cagla Acun, Olfa Nasraoui
Format: Article
Language:English
Published: MDPI AG 2025-07-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/15/13/7544
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1849429120576913408
author Cagla Acun
Olfa Nasraoui
author_facet Cagla Acun
Olfa Nasraoui
author_sort Cagla Acun
collection DOAJ
description Post hoc explanations for black-box machine learning models have been criticized for potentially inaccurate surrogate models and computational burden at prediction time. We propose pre hoc and co hoc explainability frameworks that integrate interpretability directly into the training process through an inherently interpretable white-box model. Pre hoc uses the white-box model to regularize the black-box model, while co hoc jointly optimizes both models with a shared loss function. We extend these frameworks to generate instance-specific explanations using Jensen–Shannon divergence as a regularization term. Our two-phase approach first trains models for fidelity, then generates local explanations through neighborhood-based fine-tuning. Experiments on credit risk scoring and movie recommendation datasets demonstrate superior global and local fidelity compared to LIME, without compromising accuracy. The co hoc framework additionally enhances white-box model accuracy by up to 3%, making it valuable for regulated domains requiring interpretable models. Our approaches provide more faithful and consistent explanations at a lower computational cost than existing methods, offering a promising direction for making machine learning models more transparent and trustworthy while maintaining high prediction accuracy.
format Article
id doaj-art-511eac7899b440628cea1991c09dc2c6
institution Kabale University
issn 2076-3417
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-511eac7899b440628cea1991c09dc2c62025-08-20T03:28:28ZengMDPI AGApplied Sciences2076-34172025-07-011513754410.3390/app15137544Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and PerformanceCagla Acun0Olfa Nasraoui1Knowledge Discovery and Web Mining Lab, Department of Computer Science and Engineering, University of Louisville, Louisville, KY 40292, USAKnowledge Discovery and Web Mining Lab, Department of Computer Science and Engineering, University of Louisville, Louisville, KY 40292, USAPost hoc explanations for black-box machine learning models have been criticized for potentially inaccurate surrogate models and computational burden at prediction time. We propose pre hoc and co hoc explainability frameworks that integrate interpretability directly into the training process through an inherently interpretable white-box model. Pre hoc uses the white-box model to regularize the black-box model, while co hoc jointly optimizes both models with a shared loss function. We extend these frameworks to generate instance-specific explanations using Jensen–Shannon divergence as a regularization term. Our two-phase approach first trains models for fidelity, then generates local explanations through neighborhood-based fine-tuning. Experiments on credit risk scoring and movie recommendation datasets demonstrate superior global and local fidelity compared to LIME, without compromising accuracy. The co hoc framework additionally enhances white-box model accuracy by up to 3%, making it valuable for regulated domains requiring interpretable models. Our approaches provide more faithful and consistent explanations at a lower computational cost than existing methods, offering a promising direction for making machine learning models more transparent and trustworthy while maintaining high prediction accuracy.https://www.mdpi.com/2076-3417/15/13/7544XAIexplainability in machine learninglocal explainabilityglobal explainability
spellingShingle Cagla Acun
Olfa Nasraoui
Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance
Applied Sciences
XAI
explainability in machine learning
local explainability
global explainability
title Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance
title_full Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance
title_fullStr Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance
title_full_unstemmed Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance
title_short Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance
title_sort pre hoc and co hoc explainability frameworks for integrating interpretability into machine learning training for enhanced transparency and performance
topic XAI
explainability in machine learning
local explainability
global explainability
url https://www.mdpi.com/2076-3417/15/13/7544
work_keys_str_mv AT caglaacun prehocandcohocexplainabilityframeworksforintegratinginterpretabilityintomachinelearningtrainingforenhancedtransparencyandperformance
AT olfanasraoui prehocandcohocexplainabilityframeworksforintegratinginterpretabilityintomachinelearningtrainingforenhancedtransparencyandperformance