Diagnostic Accuracy of a Machine Learning-Derived Appendicitis Score in Children: A Multicenter Validation Study

Bibliographic Details
Main Authors: Emrah Aydın, Taha Eren Sarnıç, İnan Utku Türkmen, Narmina Khanmammadova, Ufuk Ateş, Mustafa Onur Öztan, Tamer Sekmenli, Necip Fazıl Aras, Tülin Öztaş, Ali Yalçınkaya, Murat Özbek, Deniz Gökçe, Hatice Sonay Yalçın Cömert, Osman Uzunlu, Aliye Kandırıcı, Nazile Ertürk, Alev Süzen, Fatih Akova, Mehmet Paşaoğlu, Egemen Eroğlu, Gülnur Göllü Bahadır, Ahmet Murat Çakmak, Salim Bilici, Ramazan Karabulut, Mustafa İmamoğlu, Haluk Sarıhan, Süleyman Cüneyt Karakuş
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Children
Online Access: https://www.mdpi.com/2227-9067/12/7/937
Description
Summary: Background: Accurate diagnosis of acute appendicitis in children remains challenging because presentations vary and existing clinical scoring systems have limitations. Machine learning (ML) offers a promising way to improve diagnostic precision, but most prior studies have been limited by small sample sizes, single-center data, or a lack of external validation. Methods: This prospective, multicenter study included 8586 pediatric patients and developed a machine learning-based diagnostic model using routinely available clinical and hematological parameters. A separate, prospectively collected external validation cohort of 3000 patients was used to assess model performance. The Random Forest algorithm was selected for its superior performance during model comparison. Diagnostic accuracy, sensitivity, specificity, area under the curve (AUC), and calibration metrics were evaluated and compared with traditional scoring systems: the Pediatric Appendicitis Score (PAS), the Alvarado score, and the Appendicitis Inflammatory Response Score (AIRS). Results: The ML model outperformed the traditional clinical scores in both the development and validation cohorts. In the external validation set, the Random Forest model achieved an AUC of 0.996, accuracy of 0.992, sensitivity of 0.998, and specificity of 0.993. Feature-importance analysis identified white blood cell count, red blood cell count, and mean platelet volume as key predictors. Conclusions: This large, prospectively validated study demonstrates that a machine learning-based scoring system using commonly accessible data can significantly improve the diagnosis of pediatric appendicitis. The model offers high accuracy and clinical interpretability and has the potential to reduce diagnostic delays and unnecessary imaging.
ISSN: 2227-9067
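
Note on the metrics named in the Summary: sensitivity, specificity, and accuracy are all derived from the four cells of a binary confusion matrix. The following is a minimal Python sketch of those standard definitions. The class and function names are illustrative, and the counts are hypothetical placeholders, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 confusion matrix for a binary diagnostic test."""
    tp: int  # appendicitis present, test positive
    fp: int  # appendicitis absent, test positive
    tn: int  # appendicitis absent, test negative
    fn: int  # appendicitis present, test negative


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true cases that the test flags: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-cases that the test clears: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all patients classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Hypothetical counts for illustration only (NOT taken from the study).
counts = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)

print(f"sensitivity = {sensitivity(counts):.3f}")
print(f"specificity = {specificity(counts):.3f}")
print(f"accuracy    = {accuracy(counts):.3f}")
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always falls between the two when all three are computed from the same confusion matrix.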