Prediction of Metastasis in Paragangliomas and Pheochromocytomas Using Machine Learning Models: Explainability Challenges

Bibliographic Details
Main Authors: Carmen García-Barceló, David Gil, David Tomás, David Bernabeu
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/25/13/4184
Collection: DOAJ
Description: One of the main issues with paragangliomas and pheochromocytomas is that these tumors have up to a 20% rate of metastatic disease, which cannot be reliably predicted. While machine learning models hold great promise for enhancing predictive accuracy, their often opaque nature limits trust and adoption in critical fields such as healthcare. Understanding the factors driving predictions is essential not only for validating their reliability but also for enabling their integration into clinical decision-making. In this paper, we propose an architecture that combines data mining, machine learning, and explainability techniques to improve predictions of metastatic disease in these types of cancer and enhance trust in the models. A wide variety of algorithms have been applied for the development of predictive models, with a focus on interpreting their outputs to support clinical insights. Our methodology involves a comprehensive preprocessing phase to prepare the data, followed by the application of classification algorithms. Explainability techniques were integrated to provide insights into the key factors driving predictions. Additionally, a feature selection process was performed to identify the most influential variables and explore how their inclusion affects model performance. The best-performing algorithm, Random Forest, achieved an accuracy of 96.3%, precision of 96.5%, and AUC of 0.963, among other metrics, combining strong predictive capability with explainability that fosters trust in clinical applications.
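The abstract reports accuracy, precision, and AUC for the best model. As a reading aid only, the following minimal sketch shows how these three metrics are conventionally computed for a binary classifier. The labels and scores are made-up toy values, not data from the article, and the function names are hypothetical; nothing here reproduces the authors' pipeline.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred):
    """True positives divided by all predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = metastatic, 0 = non-metastatic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]  # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # threshold at 0.5

print(accuracy(y_true, y_pred))   # 0.75
print(precision(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))    # 0.9375
```

Accuracy and precision depend on the chosen decision threshold (0.5 here, an assumption), whereas AUC is computed from the raw scores and is threshold-independent.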
ISSN: 1424-8220
DOI: 10.3390/s25134184
Citation: Sensors, vol. 25, no. 13, article 4184 (2025-07-01)
Author affiliation (all four authors): University Institute for Computer Research, University of Alicante, Carretera San Vicente del Raspeig s/n, 03690 San Vicente del Raspeig, Spain
Subjects: machine learning; explainability; data science; classification; feature selection; tumor