A machine learning model for predicting lymph node positivity in ovarian cancer: development, validation, and clinical application

Bibliographic Details
Main Authors: QingYong Guo, Jinji Wang, Ru Chen, LiPing Hu, Wenqiang You
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-07-01
Series: Frontiers in Oncology
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2025.1527674/full
Description
Summary:
Background: Ovarian cancer (OC) remains a highly lethal gynecological malignancy, often diagnosed at advanced stages with a poor prognosis. Lymph node involvement is a critical prognostic factor and significantly influences treatment planning. However, accurately predicting lymph node positivity remains challenging due to the disease’s heterogeneity and the limitations of traditional models in handling high-dimensional and imbalanced data.
Methods: A retrospective analysis was conducted using the SEER database (2000–2021), including 26,844 OC patients with complete clinical information. We developed a machine learning model incorporating multiple algorithms, with XGBoost demonstrating superior performance. SMOTE was used to address class imbalance, and LASSO regression aided in selecting key predictors such as tumor size, histology, chemotherapy, and surgery. Model performance was assessed via accuracy, sensitivity, specificity, F1 score, and AUC, with external validation performed using an independent cohort from Fujian Provincial Maternity and Children’s Hospital.
Results: The XGBoost model achieved an AUC of 0.98 (95% CI: 0.975–0.986) in the training set and 0.847 (95% CI: 0.823–0.871) in external validation. The model demonstrated high sensitivity and robust performance in identifying lymph node-positive cases. Tumor size ≥5 cm, histological subtype, and chemotherapy were key predictive features, with SHAP analysis identifying tumor size as the most influential factor.
Conclusion: We present the first machine learning model specifically developed for predicting lymph node positivity in OC, validated across large, diverse cohorts. To facilitate clinical translation, we developed a free, user-friendly online calculator, which allows clinicians to quickly estimate lymph node positivity risk using patient-specific clinical parameters. This tool can be accessed at http://127.0.0.1:6818 and serves as a practical, evidence-based aid to support individualized treatment decisions and potentially improve patient outcomes. Future studies should integrate molecular data and broaden external validation to enhance generalizability.
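The evaluation metrics named in the summary (accuracy, sensitivity, specificity, F1 score, and AUC) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and the AUC here is computed with the rank-based (Mann-Whitney) formulation rather than any particular library routine.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Threshold-based metrics for a binary classifier (1 = node-positive).

    The 0.5 default threshold is an assumption, not a value from the paper.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 3 node-positive and 5 node-negative patients.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
    print(classification_metrics(labels, scores))
    print("AUC:", roc_auc(labels, scores))
```

Sensitivity and specificity are reported separately because, with imbalanced classes such as those the summary describes, accuracy alone can look high even when most node-positive cases are missed.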
ISSN: 2234-943X