Pre Hoc and Co Hoc Explainability: Frameworks for Integrating Interpretability into Machine Learning Training for Enhanced Transparency and Performance

Post hoc explanations for black-box machine learning models have been criticized for potentially inaccurate surrogate models and computational burden at prediction time. We propose pre hoc and co hoc explainability frameworks that integrate interpretability directly into the training process through...

Bibliographic Details
Main Authors:	Cagla Acun, Olfa Nasraoui
Format:	Article
Language:	English
Published:	MDPI AG 2025-07-01
Series:	Applied Sciences
Subjects:	XAI; explainability in machine learning; local explainability; global explainability
Online Access:	https://www.mdpi.com/2076-3417/15/13/7544

collection	DOAJ
description	Post hoc explanations for black-box machine learning models have been criticized for potentially inaccurate surrogate models and computational burden at prediction time. We propose pre hoc and co hoc explainability frameworks that integrate interpretability directly into the training process through an inherently interpretable white-box model. Pre hoc uses the white-box model to regularize the black-box model, while co hoc jointly optimizes both models with a shared loss function. We extend these frameworks to generate instance-specific explanations using Jensen–Shannon divergence as a regularization term. Our two-phase approach first trains models for fidelity, then generates local explanations through neighborhood-based fine-tuning. Experiments on credit risk scoring and movie recommendation datasets demonstrate superior global and local fidelity compared to LIME, without compromising accuracy. The co hoc framework additionally enhances white-box model accuracy by up to 3%, making it valuable for regulated domains requiring interpretable models. Our approaches provide more faithful and consistent explanations at a lower computational cost than existing methods, offering a promising direction for making machine learning models more transparent and trustworthy while maintaining high prediction accuracy.
id	doaj-art-511eac7899b440628cea1991c09dc2c6
institution	Kabale University
issn	2076-3417
spelling	doaj-art-511eac7899b440628cea1991c09dc2c6; 2025-08-20T03:28:28Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-07-01; 15(13), 7544; 10.3390/app15137544; Cagla Acun, Olfa Nasraoui (both: Knowledge Discovery and Web Mining Lab, Department of Computer Science and Engineering, University of Louisville, Louisville, KY 40292, USA)
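The description above mentions a shared loss function and a Jensen–Shannon divergence regularization term, but this record does not give the formulas. The following is a minimal sketch of how such a fidelity-regularized objective might look, not the authors' implementation. The function names, the weight `lam`, and the choice of cross-entropy as the accuracy term are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions, in nats."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def cross_entropy(labels, probs, eps=1e-12):
    """Mean negative log-likelihood of the true class index."""
    return -sum(math.log(p[y] + eps) for y, p in zip(labels, probs)) / len(labels)

def joint_loss(labels, p_black, p_white, lam=0.5):
    """Hypothetical co hoc-style objective (assumed form, not from the paper):
    accuracy terms for both models plus a fidelity penalty that grows
    when the black-box and white-box predictions disagree."""
    fidelity = sum(js_divergence(b, w)
                   for b, w in zip(p_black, p_white)) / len(labels)
    return (cross_entropy(labels, p_black)
            + cross_entropy(labels, p_white)
            + lam * fidelity)

# Toy example: two instances, two classes.
labels = [0, 1]
p_black = [[0.9, 0.1], [0.2, 0.8]]   # black-box predicted probabilities
p_white = [[0.7, 0.3], [0.4, 0.6]]   # white-box predicted probabilities
print(joint_loss(labels, p_black, p_white, lam=0.5))
```

Under the same assumptions, a pre hoc-style variant would hold the white-box predictions fixed and minimize only the black-box cross-entropy plus the `lam`-weighted fidelity term.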

Similar Items