A Deep Learning–Enabled Workflow to Estimate Real-World Progression-Free Survival in Patients With Metastatic Breast Cancer: Study Using Deidentified Electronic Health Records

Abstract. Background: Progression-free survival (PFS) is a crucial endpoint in cancer drug research. Clinician-confirmed cancer progression, namely real-world PFS (rwPFS) in unstructured text (ie, clinical notes), serves as a reasonable surrogate for real-world indicators in asce...

Bibliographic Details
Main Authors: Gowtham Varma, Rohit Kumar Yenukoti, Praveen Kumar M, Bandlamudi Sai Ashrit, K Purushotham, C Subash, Sunil Kumar Ravi, Verghese Kurien, Avinash Aman, Mithun Manoharan, Shashank Jaiswal, Akash Anand, Rakesh Barve, Viswanathan Thiagarajan, Patrick Lenehan, Scott A Soefje, Venky Soundararajan
Format: Article
Language: English
Published: JMIR Publications 2025-05-01
Series: JMIR Cancer
Online Access: https://cancer.jmir.org/2025/1/e64697
author Gowtham Varma
Rohit Kumar Yenukoti
Praveen Kumar M
Bandlamudi Sai Ashrit
K Purushotham
C Subash
Sunil Kumar Ravi
Verghese Kurien
Avinash Aman
Mithun Manoharan
Shashank Jaiswal
Akash Anand
Rakesh Barve
Viswanathan Thiagarajan
Patrick Lenehan
Scott A Soefje
Venky Soundararajan
collection DOAJ
description Abstract
Background: Progression-free survival (PFS) is a crucial endpoint in cancer drug research. Clinician-confirmed cancer progression, namely real-world PFS (rwPFS) in unstructured text (ie, clinical notes), serves as a reasonable surrogate for real-world indicators in ascertaining progression endpoints. Response Evaluation Criteria in Solid Tumors (RECIST) is traditionally applied in clinical trials through serial imaging evaluations but is impractical when working with real-world data. Manual abstraction of clinical progression from unstructured notes remains the gold standard. However, this process is resource-intensive and time-consuming. Natural language processing (NLP), a subdomain of machine learning, has shown promise in recent years in accelerating the extraction of tumor progression from real-world data.
Objectives: We aim to configure a pretrained, general-purpose health care NLP framework to transform free-text clinical notes and radiology reports into structured progression events for studying rwPFS in metastatic breast cancer (mBC) cohorts.
Methods: This study developed and validated a novel semiautomated workflow to estimate rwPFS in patients with mBC using deidentified electronic health record data from the Nference nSights platform. The workflow was validated in a cohort of 316 patients with hormone receptor–positive, human epidermal growth factor receptor 2 (HER-2)–negative mBC who were started on palbociclib and letrozole combination therapy between January 2015 and December 2021. Ground-truth datasets were curated to evaluate the workflow’s performance at both the sentence and patient levels. NLP-captured progression and changes in therapy line were considered outcome events, while death, loss to follow-up, and end of the study period were considered censoring events for rwPFS computation. Peak reduction and cumulative decline in Patient Health Questionnaire-8 (PHQ-8) scores were analyzed in the progressed and nonprogressed patient subgroups.
Results: The configured clinical NLP engine achieved a sentence-level progression capture accuracy of 98.2%. At the patient level, initial progression was captured within ±30 days with 88% accuracy. The median rwPFS for the study cohort (N=316) was 20 (95% CI 18-25) months. In a validation subset (n=100), rwPFS determined by manual curation was 25 (95% CI 15-35) months, closely aligning with the computational workflow’s 22 (95% CI 15-35) months. A subanalysis revealed rwPFS estimates of 30 (95% CI 24-39) months from radiology reports and 23 (95% CI 19-28) months from clinical notes, highlighting the importance of integrating multiple note sources. External validation also demonstrated high accuracy (92.5% sentence level; 90.2% patient level). Sensitivity analysis revealed stable rwPFS estimates across varying levels of missing source data and event definitions. Peak reduction in PHQ-8 scores during the study period highlighted significant associations between patient-reported outcomes and disease progression.
Conclusions: This workflow enables rapid and reliable determination of rwPFS in patients with mBC receiving combination therapy. Further validation across more diverse external datasets and other cancer types is needed to ensure broader applicability and generalizability.
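The event and censoring rules stated in the Methods part of the abstract can be illustrated with a small Kaplan-Meier sketch. This is not the authors' implementation: the `Patient` fields, the function names, and the choice of a plain Kaplan-Meier median are assumptions made for illustration, and the example cohort is invented. Times are taken to be months since therapy start.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Patient:
    # Hypothetical record layout; all times are months since therapy start.
    progression: Optional[float]     # first NLP-captured progression, if any
    therapy_change: Optional[float]  # first change in therapy line, if any
    death: Optional[float]           # treated as censoring, per the abstract
    last_follow_up: float            # last contact or end of study period


def to_time_and_event(p: Patient) -> Tuple[float, bool]:
    """Return (time, event). event is True for progression or a therapy-line
    change, False when the patient is censored."""
    outcomes = [t for t in (p.progression, p.therapy_change) if t is not None]
    censor = p.last_follow_up if p.death is None else min(p.death, p.last_follow_up)
    if outcomes and min(outcomes) <= censor:
        return min(outcomes), True
    return censor, False


def km_median(data: List[Tuple[float, bool]]) -> Optional[float]:
    """Kaplan-Meier median: earliest event time at which the estimated
    survival drops to 0.5 or below; None if that level is never reached."""
    at_risk = len(data)
    survival = 1.0
    for t in sorted({u for u, _ in data}):
        events = sum(1 for u, e in data if u == t and e)
        leaving = sum(1 for u, _ in data if u == t)
        if events:
            survival *= 1 - events / at_risk
            if survival <= 0.5:
                return t
        at_risk -= leaving  # events and censored patients both leave the risk set
    return None


# Invented five-patient cohort, for illustration only.
cohort = [
    Patient(progression=6.0, therapy_change=None, death=None, last_follow_up=30.0),
    Patient(progression=None, therapy_change=12.0, death=None, last_follow_up=30.0),
    Patient(progression=None, therapy_change=None, death=9.0, last_follow_up=9.0),
    Patient(progression=20.0, therapy_change=None, death=None, last_follow_up=30.0),
    Patient(progression=None, therapy_change=None, death=None, last_follow_up=24.0),
]
data = [to_time_and_event(p) for p in cohort]
median_rwpfs = km_median(data)
```

The sketch returns only a point estimate; the confidence intervals reported in the abstract would need an additional method that is not shown here.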
format Article
id doaj-art-13ac616b52844a009238eb2778b11d78
institution Kabale University
issn 2369-1999
language English
publishDate 2025-05-01
publisher JMIR Publications
record_format Article
series JMIR Cancer
spelling doaj-art-13ac616b52844a009238eb2778b11d78
2025-08-20T03:47:32Z
eng
JMIR Publications
JMIR Cancer
2369-1999
2025-05-01
Volume 11, article e64697
DOI 10.2196/64697
A Deep Learning–Enabled Workflow to Estimate Real-World Progression-Free Survival in Patients With Metastatic Breast Cancer: Study Using Deidentified Electronic Health Records
Gowtham Varma http://orcid.org/0009-0002-0641-7336
Rohit Kumar Yenukoti http://orcid.org/0009-0009-4815-3448
Praveen Kumar M http://orcid.org/0000-0002-9318-1167
Bandlamudi Sai Ashrit http://orcid.org/0009-0008-8876-6065
K Purushotham http://orcid.org/0009-0002-5538-5588
C Subash http://orcid.org/0009-0008-8490-1231
Sunil Kumar Ravi http://orcid.org/0009-0009-4102-9495
Verghese Kurien http://orcid.org/0009-0000-6682-8765
Avinash Aman http://orcid.org/0009-0001-0941-3149
Mithun Manoharan http://orcid.org/0009-0006-4094-9220
Shashank Jaiswal http://orcid.org/0009-0009-8536-3700
Akash Anand http://orcid.org/0009-0002-0022-105X
Rakesh Barve http://orcid.org/0009-0003-4501-8167
Viswanathan Thiagarajan http://orcid.org/0009-0003-8435-8893
Patrick Lenehan http://orcid.org/0000-0002-1950-9179
Scott A Soefje http://orcid.org/0000-0002-5486-7748
Venky Soundararajan http://orcid.org/0000-0001-7434-9211
https://cancer.jmir.org/2025/1/e64697
title A Deep Learning–Enabled Workflow to Estimate Real-World Progression-Free Survival in Patients With Metastatic Breast Cancer: Study Using Deidentified Electronic Health Records
url https://cancer.jmir.org/2025/1/e64697