Comparative analysis of convolutional neural networks and traditional machine learning models for IVF live birth prediction: a retrospective analysis of 48514 IVF cycles and an evaluation of deployment feasibility in resource-constrained settings

ObjectiveTo evaluate the predictive performance of a convolutional neural network for analyzing electronic medical records in assisted reproductive therapy and to compare its accuracy and interpretability with traditional machine learning models. The study also explores the feasibility of deploying...

Full description

Saved in:
Bibliographic Details
Main Authors: Yu Liu, Yi Wang, Kai Huang, Hao Shi, Hang Xin, Shanjun Dai, Jinhao Liu, Xinhong Yang, Jianyuan Song, Fuli Zhang, Yihong Guo
Format: Article
Language:English
Published: Frontiers Media S.A. 2025-06-01
Series:Frontiers in Endocrinology
Subjects:
Online Access:https://www.frontiersin.org/articles/10.3389/fendo.2025.1556681/full
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:ObjectiveTo evaluate the predictive performance of a convolutional neural network for analyzing electronic medical records in assisted reproductive therapy and to compare its accuracy and interpretability with traditional machine learning models. The study also explores the feasibility of deploying such models in resource-limited clinical settings.DesignRetrospective cohort study based on EMR data using five models: CNN, Naïve Bayes, Random Forest, Decision Tree, and Feedforward Neural Network. Feature importance and model interpretability were evaluated using SHAP.SettingFirst Hospital of Zhengzhou University.Population48,514 fresh IVF cycles from August 2009 to May 2018.MethodsPreprocessed EMR data were used to train and evaluate five classification models predicting live birth outcomes. Stratified 5-fold cross-validation was performed for robust performance estimation. ROC curves and AUC values were used for comparative evaluation.Main Outcome MeasureLive birth.ResultsThe CNN model achieved an accuracy of 0.9394 ± 0.0013, AUC of 0.8899 ± 0.0032, precision of 0.9348 ± 0.0018, recall of 0.9993 ± 0.0012, and F1 score of 0.9660 ± 0.0007. Its performance was comparable to Random Forest (accuracy: 0.9406 ± 0.0017, AUC: 0.9734 ± 0.0012), and superior to Decision Tree, Naïve Bayes, and Feedforward Neural Network in recall and robustness. CNN demonstrated stable convergence during training, and SHAP-based interpretation highlighted maternal age, BMI, antral follicle count, and gonadotropin dosage as the top predictors for live birth outcome.ConclusionsWith appropriate input transformation, CNNs can effectively model structured EMR data and offer predictive performance comparable to ensemble methods. Their scalability, high sensitivity, and interpretability make CNNs promising candidates for integration into clinical workflows, particularly in environments with limited computational resources.
ISSN:1664-2392