Prediction of recurrence after surgery for pituitary adenoma using machine learning-based models: systematic review and meta-analysis

Bibliographic Details
Main Authors: Ibrahim Mohammadzadeh, Bardia Hajikarimloo, Behnaz Niroomand, Nasira Faizi, Pooya Eini, Mohammad Amin Habibi, Alireza Mohseni, Mohammadmahdi Sabahi, Abdulrahman Albakr, Michael Karsy, Hamid Borghei-Razavi
Format: Article
Language: English
Published: BMC 2025-07-01
Series: BMC Endocrine Disorders
Online Access: https://doi.org/10.1186/s12902-025-01955-8
Description
Summary: Abstract

Background: Predicting pituitary adenoma (PA) recurrence after surgical resection is critical for guiding clinical decision-making, and machine learning (ML)-based models show great promise in improving the accuracy of these predictions. These models can provide valuable insights to surgeons and oncologists, helping them tailor personalized treatment plans, enhance patient prognostication, and optimize follow-up strategies.

Methods: We systematically searched the PubMed, Scopus, Embase, Cochrane Library, and Web of Science databases through November 2024, following PRISMA guidelines.

Results: Of 1240 studies screened, six met our eligibility criteria for ML-based approaches to predicting PA recurrence. The studies employed 12 different ML algorithms. Meta-analysis showed a pooled sensitivity of 0.87 [95% CI: 0.78–0.92], specificity of 0.86 [95% CI: 0.67–0.95], positive diagnostic likelihood ratio (DLR) of 6.32 [95% CI: 2.46–16.26], and negative DLR of 0.16 [95% CI: 0.1–0.25]. The diagnostic odds ratio (DOR) was 40.52 [95% CI: 13–126.27], and the diagnostic score was 3.7 [95% CI: 2.57–4.84]. The pooled AUC was 0.89 [95% CI: 0.86–0.92], indicating high overall diagnostic performance. In the comparison between logistic regression (LR) and non-LR algorithms, LR-based algorithms had numerically higher AUC and sensitivity, but these differences were not statistically significant. LR-based algorithms also showed lower specificity, positive likelihood ratio, and diagnostic odds ratio, but the statistical tests did not provide strong evidence of meaningful differences.

Conclusion: AI-based models show strong predictive power for recurrence in both functional and non-functional pituitary adenomas, with an average accuracy above 80%. However, the lack of external validation and the complexity of input data pose challenges, highlighting the need for rigorous validation with multi-center datasets and standardized imaging techniques to enhance clinical applicability.
ISSN: 1472-6823
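
Note on the reported metrics: the likelihood ratios, diagnostic odds ratio, and diagnostic score in the summary are linked by standard textbook definitions. The Python sketch below illustrates those relationships using the pooled sensitivity and specificity as plain point estimates. It is not the authors' meta-analytic procedure, and the function name `derived_metrics` is hypothetical. Because the published values are pooled across studies by the meta-analysis model, this simple arithmetic is expected to match them only approximately.

```python
import math


def derived_metrics(sensitivity: float, specificity: float) -> dict:
    """Derive likelihood ratios, DOR, and diagnostic score from point estimates.

    Uses the standard definitions:
      LR+   = sensitivity / (1 - specificity)
      LR-   = (1 - sensitivity) / specificity
      DOR   = LR+ / LR-
      score = ln(DOR)
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    dor = lr_pos / lr_neg
    return {
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
        "diagnostic_score": math.log(dor),
    }


# Pooled point estimates quoted in the summary (sensitivity 0.87, specificity 0.86).
metrics = derived_metrics(0.87, 0.86)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")

# The diagnostic score is the natural log of the DOR, so the reported
# DOR of 40.52 corresponds to a score of about 3.70.
print(f"ln(40.52) = {math.log(40.52):.2f}")
```

The derived values (LR+ near 6.2, LR- near 0.15, DOR near 41) fall close to the reported 6.32, 0.16, and 40.52, and the natural log of the reported DOR reproduces the reported diagnostic score of 3.7.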